Abstract


This program has investigated the use of limit cycles to represent and process symbolic
information in the context of an inference machine. This approach was proposed as a
means of overcoming problems with fault tolerance and relatively small space-bandwidth
products in current spatial light modulator (SLM) technology. The program has focused
on developing a storage medium with many limit cycles (oscillatory modes) available and
a method for coupling the various modes in a desired way. Because of their flexibility,
neural network ideas were used as the basis for the components and algorithms developed.

In  the  theoretical  realm,  the  program  has  had  many  accomplishments.  First,  the 
self-oscillating  neural  network  (SONN)  model  was  developed  and  characterized  as  the 
oscillatory  medium.  This  model  was  designed  with  optical  spatial  SLMs  in  mind  and 
does  not  require  any  training  or  programming.  Furthermore,  it  is  highly  tolerant  of 
static  parameter  variations  inherent  in  the  optics. 

Next, the spectral back-propagation (SBP) training algorithm was developed with
complete generality as a means of forming the coupling trajectories. This algorithm
trains input-output sequences into a network using an error criterion based on a Fourier
series decomposition of the sequences. The method allows the interconnects to have
trainable time delays in addition to the weights. This capability proved very beneficial
when developing a transition-detecting network for realizing the mode couplings. The
algorithm also allows the cells to have finite bandwidth.

The SONN and the SBP algorithm were combined to demonstrate a simple
symbolic processing system based on limit cycles. The chosen paradigm was a finite
state machine (FSM), a simple starting point for building up to a complete inference
machine.

With respect to optics, an optical architecture for the SONN model was designed.
This architecture is effectively a specialized optical neural network based on holographic
interconnects between SLMs. To satisfy architectural demands, the program has developed
a method for generating interconnection holograms using a computer and current
color printer technology. The holograms are phase-only, have very high efficiency, require
low cost processing facilities, and are expediently made.
1 Introduction


The goal of this research program was to investigate the use of optics in symbolic
processing systems, and in particular, inference machines. These systems store information
in the form of relationships between symbols, usually arranged as a knowledge base of
rules. Their function is to infer the answers to queries of the knowledge base by searching
through the set of relationships. The structure and function of an inference machine and
the necessary considerations for an optical implementation are discussed in more detail
in Section 8.

The  operational  requirements  for  an  inference  machine  are  (1)  to  store  many  rules  (i.e., 
have  a  large  knowledge  base  capacity)  and  (2)  to  search  the  knowledge  base  very  quickly. 
Unlike  numerical  processors  such  as  matrix-vector  multipliers,  inference  machines  do 
not  require  large  dynamic  range.  Thus,  the  parallelism  and  speed  of  optics  offered  an 
attractive  implementation  technology. 

The initial optical architectures for an optical inference machine were based on
thresholding matrix-vector multipliers. These designs are described in more detail in Section 9.
Unfortunately, these architectures suffered from two major limitations: small capacity
and low fault tolerance. A rule was represented by a matrix that encoded a particular
relationship between vectors of symbols. In the optical implementation, the rule matrices
were stored on spatial light modulators (SLMs). Therefore, the size of the rule matrices
and thus the symbol vectors was limited by the size of current SLM technology.
Furthermore, if a pixel on the SLM failed, the corresponding element in all the rule matrices
would be altered permanently, thus causing all rules to be changed in an undesired way.

We proposed to consider a very different approach to designing an inference machine
for an optical implementation in hopes of solving these two problems. Instead of
representing symbols using fault-intolerant vectors (fixed points), we have investigated the use
of limit cycles for this role. This form constitutes a dynamic, temporal representation of
information. A known disadvantage of this method is slower access, since multiple
points along a trajectory must be observed in order to recognize the current cycle (if one
is even active).

In order to construct an inference machine based on limit cycles, two fundamental
components are needed. The first one is a medium with many limit cycles (i.e., oscillatory
modes) available. Each cycle would represent a different symbol or a logical state of the
machine. Ideally, this component would not require any programming or training. We
successfully developed the self-oscillating neural network (SONN) model for this purpose.
This model is designed with an optical SLM implementation in mind. It is described in
Section 2.

The second component is a method for coupling the cycles. Ideally, this method (1)
is not dependent upon cycle shapes, (2) can be reprogrammed or retrained as needed or
desired, and (3) does not perform any explicit conversions from a limit cycle (LC) to a
fixed point (FP).¹ We developed the spectral back-propagation (SBP) training algorithm
and a transition-detecting network for this purpose. Functionally, a transition detector
achieves the first two characteristics, but not the third one. The SBP algorithm is
described with complete generality in Section 3.

¹ This requirement is not necessary for a practical machine. It is included here to focus on the study
of using oscillatory phenomena to represent and process information.

We demonstrated these concepts by simulating a finite state machine (FSM) based
on limit cycles (an LC-FSM). This computation paradigm offers a simple starting point
for building up to a complete inference machine. The simulated LC-FSM used the SONN
both as an associative memory for limit cycles and as an input source for the LC-FSM.
In addition, the LC-FSM illustrated SBP-trained transition detectors for recognizing
transition conditions. This work is discussed in Section 4.

The optical implementation considerations have focused on the SONN model because
of the latter's robustness. An optical architecture for the SONN has been designed and
is discussed in Section 5. Besides the SLMs, the key components of this architecture
are the interconnect holograms. Practical constraints require high efficiency interconnect
holograms, which motivated the development of a new approach to computer generating
interconnection holograms. The process we have developed takes advantage of current
color printer technology as a mechanism for modulating the exposure of black-and-white
film. Computer generated color masks representing desired phase-only functions were
photoreduced onto high resolution black-and-white film. After developing and bleach
processing, each of the printer colors maps to a discrete phase level as required by the
specified phase-only function. A total of 8 colors yielded a total of 8 discrete phase
levels, which were used to construct arbitrary synthetic blazed grating interconnections
exhibiting efficiencies of at least 50%. In addition, a computer-controlled system for
characterizing SLM parameters was developed. This work also is discussed in Section 5.

2 Self-Oscillating Neural Network Model

The  SONN  model  is  an  oscillatory  medium  with  many  modes  (i.e.,  limit  cycles)  available 
naturally.  No  training  or  programming  is  required.  Furthermore,  the  existence  of  the 
modes  is  highly  tolerant  of  static  parameter  variations  in  the  network  parameters. 

The structure of the SONN is shown in Figure 1. The network is a hierarchical
arrangement of smaller feedforward networks called levels. Connections going up the
hierarchy are excitatory with value $c_L$ = [value illegible]. Similarly, connections going down the
hierarchy are inhibitory with value $-h_L = -1$.

Each level uses the off-center, on-surround canonical interconnect topology shown in
Figure 2. The central inhibitory interconnect has a weight $-h_l = -1$ and the off-center
excitatory interconnects have weights $c_l$ = [value illegible]. For the levels shown in Figure 1, these
values correspond to a ratio of the total excitation to the total inhibition of [value illegible]. Simulations
showed that this ratio value is a good choice for robust oscillatory behavior to exist.

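The cells in the SONN follow the model shown in Figure 3. They are governed by
the discrete-time equations,

$$u_i[t] = x_i[t] + \sum_j w_{ij}\, y_j[t - T_{ij}], \tag{1}$$

$$v_i[t] = \alpha_v\, v_i[t-1] + (1 - \alpha_v)\, u_i[t], \tag{2}$$

$$y_i[t] = S_i(v_i[t]), \tag{3}$$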
Figure 1: Self-oscillating neural network (SONN). Solid lines indicate excitatory
interconnects whereas shaded lines denote inhibitory interconnects.
[Figure 5 plot, titled "M6C1-40X.NET Cycle Continuum": each input $x_i$ is swept from 0.0 to 0.3 in steps of 0.02.]

Figure 5: Sampling of the continuum of cycles available in the SONN with constant
inputs. The weights were perturbed randomly by ±20% from the nominal values.
Similarly, the time delays were randomly selected from {0, 1, 2, 3, 4} iterations.


where $x_i[t]$ is the external input to the $i$th cell, $u_i[t]$ is the weighted input sum for
the $i$th cell, $v_i[t]$ is the filtered input sum with damping factor $\alpha_v$, $y_i[t]$ is the cell
output as formed by the output function $S_i(\cdot)$, $w_{ij}$ is the weight associated with the
interconnect from the $j$th cell, and $T_{ij}$ is the time delay associated with the same
interconnect. The nominal shape for a sigmoidal output function is shown in Figure 4. For
simplicity, the time delays, if present, are taken to be integers here. In the simulations,
the damping factor was set to 0.7, which corresponded to an effective time constant of
approximately 2.8 iterations.
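The relationship between the damping factor and the effective time constant can be checked with a minimal sketch of one cell update. It assumes the first-order filter form of Equation (2) and the relation $\alpha_v = e^{-1/\tau_v}$; the function names and the `tanh` output function are placeholders for illustration, not the report's actual sigmoid.

```python
import math

def time_constant(alpha_v):
    """Effective time constant in iterations, assuming alpha_v = exp(-1/tau_v)."""
    return -1.0 / math.log(alpha_v)

def cell_step(v_prev, x, weights, delayed_outputs, alpha_v=0.7, output_fn=math.tanh):
    """Advance one cell by one iteration.

    weights[j] plays the role of w_ij and delayed_outputs[j] the role of
    y_j[t - T_ij]; the caller is assumed to have looked up the delayed values.
    output_fn stands in for S_i (tanh is only a placeholder).
    """
    u = x + sum(w * y for w, y in zip(weights, delayed_outputs))  # weighted input sum
    v = alpha_v * v_prev + (1.0 - alpha_v) * u                     # first-order filter
    return v, output_fn(v)

def step_response(alpha_v, n_steps):
    """Filtered sum of an isolated cell driven by a unit step input."""
    v, trace = 0.0, []
    for _ in range(n_steps):
        v, _ = cell_step(v, 1.0, [], [], alpha_v)
        trace.append(v)
    return trace

tau = time_constant(0.7)
trace = step_response(0.7, 3)
```

With a damping factor of 0.7, `tau` evaluates to roughly 2.8 iterations, which agrees with the value quoted for the simulations.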

A mode can be selected with either a constant (i.e., FP) or cyclical (i.e., LC) input.
Using constant inputs, a sampling of the cycles available is shown in Figure 5. The
particular SONN that generated these cycles had its weights randomly perturbed by up
to ±20% from their nominal values. Furthermore, the network had a distribution of time
delays throughout the network with $T_{ij}$ ranging from 0 to 4 iterations with a mean of 2
iterations. The periods of the cycles generated by the SONN used in Figure 5 were all
approximately 64 iterations.

Simulations  showed  that  (1)  variations  in  the  interconnect  time  delays  create  diverse 
cycle  shapes  and  (2)  the  SONN  easily  can  tolerate  static  variations  of  over  ±20%  in  its 
network  parameters  (weights,  sigmoid  maxima  and  gains,  etc.). 

The  optical  implementation  of  the  SONN  is  discussed  in  Section  5. 

3  Spectral  Back-Propagation  Training  Algorithm 

3.1  Overview 

The SBP training algorithm is an extension of the conventional back-propagation
method [1] for training a neural network to learn a set of smooth input-output sequences.
It can adapt both the weights and the time delays or any combination thereof. It can work
with either feedforward or recurrent networks. In addition, the cells can have infinite or
finite bandwidth using a first order approximation as in Equation (2).

The algorithm has been demonstrated successfully using computer simulations for
several different cases. It has trained a simple recurrent network with an infinite impulse
response (IIR) to learn continuous-time cycles. Similarly, it can train either the weights
or the time delays of a finite impulse response (FIR) network. Finally, it can train a
conventional feedforward network with vector input and output patterns.

3.2  Mathematical  Basis  for  the  Algorithm 

Functionally, the SBP algorithm compares the spectral decomposition of the actual
output sequences to that of the desired output sequences to form an error measurement for
driving the adaptation.

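Consider the continuous-time form of the cell equations given in (1)-(3):

$$u_i(t) = x_i(t) + \sum_j w_{ij}\, y_j(t - T_{ij}), \tag{4}$$

$$\tau_v \frac{dv_i(t)}{dt} = -v_i(t) + u_i(t), \tag{5}$$

$$y_i(t) = S_i(v_i(t)), \tag{6}$$

where $\alpha_v = e^{-1/\tau_v}$. Assuming the output $y_i(t)$ is smooth and "slowly varying," it can be
approximated by the truncated Fourier series,

$$y_i(t) \approx \sum_{k=0}^{K} \left[ Y^{\alpha}_{ik} \cos(k\omega_0 t) + Y^{\beta}_{ik} \sin(k\omega_0 t) \right], \tag{7}$$

where

$$Y^{\alpha}_{ik} = \frac{\beta_k}{T_0} \int_0^{T_0} y_i(t) \cos(k\omega_0 t)\, dt, \tag{8}$$

$$Y^{\beta}_{ik} = \frac{\beta_k}{T_0} \int_0^{T_0} y_i(t) \sin(k\omega_0 t)\, dt, \tag{9}$$

$$\beta_k = \begin{cases} 1 & \text{for } k = 0, \\ 2 & \text{for } k > 0, \end{cases} \tag{10}$$

and

$$\omega_0 = \frac{2\pi}{T_0}, \tag{11}$$

where $T_0$ is the period of the current output sequence. Using the vector
notation

$$Y_{ik} = \begin{bmatrix} Y^{\alpha}_{ik} \\ Y^{\beta}_{ik} \end{bmatrix}, \tag{12}$$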

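the cell equations can be transformed into the Fourier domain, resulting in:

$$U_{ik} = X_{ik} + \sum_j w_{ij} \begin{bmatrix} \cos(k\omega_0 T_{ij}) & -\sin(k\omega_0 T_{ij}) \\ \sin(k\omega_0 T_{ij}) & \cos(k\omega_0 T_{ij}) \end{bmatrix} Y_{jk}, \tag{13}$$

$$V_{ik} = \frac{1}{1 + (k\omega_0 \tau_v)^2} \begin{bmatrix} 1 & -k\omega_0 \tau_v \\ k\omega_0 \tau_v & 1 \end{bmatrix} U_{ik}, \tag{14}$$

$$Y_{ik} = \frac{\beta_k}{T_0} \int_0^{T_0} y_i(t) \begin{bmatrix} \cos(k\omega_0 t) \\ \sin(k\omega_0 t) \end{bmatrix} dt. \tag{15}$$

For linear cells with

$$y_i(t) = m_i v_i(t) \tag{16}$$

where $m_i$ is a gain constant, the spectral cell output is simply

$$Y_{ik} = m_i V_{ik}. \tag{17}$$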
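The way a time delay turns into a rotation of the cosine and sine coefficients in the Fourier domain can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the coefficient normalization of 1 for $k = 0$ and 2 for $k > 0$, a sampled period of 200 points, and hypothetical helper names (`harmonic`, `delayed_harmonic`) that do not come from the report.

```python
import math

def harmonic(samples, k):
    """(cosine, sine) coefficients of harmonic k, taken over one period of samples."""
    n = len(samples)
    beta = 1.0 if k == 0 else 2.0
    w0 = 2.0 * math.pi / n
    cos_part = beta / n * sum(y * math.cos(k * w0 * t) for t, y in enumerate(samples))
    sin_part = beta / n * sum(y * math.sin(k * w0 * t) for t, y in enumerate(samples))
    return cos_part, sin_part

def delayed_harmonic(coeffs, k, delay, period):
    """Apply the quadrature phase (rotation) matrix for a delay to one harmonic."""
    theta = k * 2.0 * math.pi / period * delay
    a, b = coeffs
    return (math.cos(theta) * a - math.sin(theta) * b,
            math.sin(theta) * a + math.cos(theta) * b)

period, k, delay = 200, 3, 30
w0 = 2.0 * math.pi / period

def y(t):
    # A single harmonic with cosine coefficient 0.8 and sine coefficient -0.5.
    return 0.8 * math.cos(k * w0 * t) - 0.5 * math.sin(k * w0 * t)

# Coefficients measured directly from the delayed sequence ...
measured = harmonic([y(t - delay) for t in range(period)], k)
# ... and predicted by rotating the undelayed coefficients.
predicted = delayed_harmonic((0.8, -0.5), k, delay, period)
```

The two pairs agree to rounding error, so a delay never has to be simulated explicitly once the sequence is described by its spectral coefficients.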

Note that the transformation into the Fourier domain causes the time delays $T_{ij}$ to
become a simple quadrature phase matrix. Similarly, the cell time constant $\tau_v$ becomes
an amplitude scaling factor that depends on the spectral frequency component $k\omega_0$.

The spectral error criterion on which the training is based is obtained by comparing
the actual output sequence to the desired output sequence. In the time domain, with
$\hat{y}_i(t)$ denoting the desired output, the error as a function of time is

$$e_i(t) = \hat{y}_i(t) - y_i(t) \tag{18}$$
where $i$ refers only to the network output cells here. The total error for the current
output sequence over all output cells ($N_o$) is given by

$$E = \frac{1}{2T_0} \sum_{i=1}^{N_o} \int_0^{T_0} e_i^2(t)\, dt. \tag{19}$$

In  the  same  way,  the  total  error  over  all  output  sequences  is  simply  the  sum  of  E  for 
each  sequence.  Since  this  is  a  linear  operation,  the  derivation  below  will  be  done  as  if 
there  only  was  one  desired  output  sequence.  The  results  are  then  summed  over  all  output 
sequences  to  obtain  the  complete  error  criterion. 

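Using Parseval's theorem, the error in Equation (19) can be approximated by

$$E \approx \frac{1}{2} \sum_{i=1}^{N_o} \sum_{k=0}^{K} \frac{1}{\beta_k} \left[ \left(E^{\alpha}_{ik}\right)^2 + \left(E^{\beta}_{ik}\right)^2 \right], \tag{20}$$

where $E^{\alpha}_{ik}$ and $E^{\beta}_{ik}$ are the Fourier series coefficients of the temporal error sequence defined
in Equation (18). As in the conventional back-propagation algorithm, the weights and
time delays are adapted according to the gradient descent driving term,

$$\Delta z_{ij} = -\eta \frac{\partial E}{\partial z_{ij}}, \tag{21}$$

where $z_{ij}$ is either $w_{ij}$ or $T_{ij}$ and $\eta$ is the adaptation time constant.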
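The Parseval step can be illustrated numerically: the mean-square value of a periodic error sequence equals the sum of its squared cosine and sine coefficients, each divided by the normalization factor (1 for $k = 0$, 2 for $k > 0$), and truncating the sum at $K$ drops the power in the higher harmonics. The sketch below assumes that normalization; the test signal and function names are made up for illustration, and any constant prefactor in the error definition scales both sides equally.

```python
import math

def harmonic(samples, k):
    """(cosine, sine) coefficients of harmonic k over one period of samples."""
    n = len(samples)
    beta = 1.0 if k == 0 else 2.0
    w0 = 2.0 * math.pi / n
    a = beta / n * sum(e * math.cos(k * w0 * t) for t, e in enumerate(samples))
    b = beta / n * sum(e * math.sin(k * w0 * t) for t, e in enumerate(samples))
    return a, b

def mean_square(samples):
    """Time-domain mean-square value over one period."""
    return sum(e * e for e in samples) / len(samples)

def spectral_power(samples, K):
    """Spectral estimate of the mean-square value using harmonics 0..K."""
    total = 0.0
    for k in range(K + 1):
        beta = 1.0 if k == 0 else 2.0
        a, b = harmonic(samples, k)
        total += (a * a + b * b) / beta
    return total

period = 100
w0 = 2.0 * math.pi / period
# Hypothetical error sequence with energy at harmonics 0, 1, 2 and 7.
error = [0.3 + 0.5 * math.cos(w0 * t) - 0.2 * math.sin(2 * w0 * t)
         + 0.05 * math.cos(7 * w0 * t) for t in range(period)]

full = spectral_power(error, 8)       # includes every harmonic present
truncated = spectral_power(error, 4)  # misses the small 7th-harmonic term
```

Here `full` matches `mean_square(error)`, while `truncated` is slightly smaller, which is why the spectral form is an approximation when the sequence has content above harmonic $K$.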

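The spectral cell errors can be defined as

$$\delta_{ik} = \begin{bmatrix} \delta^{\alpha}_{ik} \\ \delta^{\beta}_{ik} \end{bmatrix} = -\beta_k \begin{bmatrix} \partial E / \partial V^{\alpha}_{ik} \\ \partial E / \partial V^{\beta}_{ik} \end{bmatrix}, \tag{22}$$

allowing the driving term to be written as

$$\Delta z_{ij} = \eta \sum_{k=0}^{K} \frac{1}{\beta_k}\, \delta_{ik}^{T}\, \frac{\partial V_{ik}}{\partial z_{ij}} \tag{23}$$

using the chain rule. Note that the spectral cell errors $\delta_{ik}$ are independent of $z_{ij}$, the
weight or time delay being adapted.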

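The term that is dependent upon $z_{ij}$, $\partial V_{ik}/\partial z_{ij}$, can be derived from the spectral cell
equations in (13) and (14). When $z_{ij}$ is the weight $w_{ij}$, $\partial V_{ik}/\partial w_{ij}$ is given by

$$\frac{\partial V_{ik}}{\partial w_{ij}} = \begin{bmatrix} A_{1k}(T_{ij}) & -A_{2k}(T_{ij}) \\ A_{2k}(T_{ij}) & A_{1k}(T_{ij}) \end{bmatrix} Y_{jk}, \tag{24}$$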
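where

$$A_{1k}(T_{ij}) = \frac{1}{1 + (k\omega_0 \tau_v)^2} \left[ \cos(k\omega_0 T_{ij}) - k\omega_0 \tau_v \sin(k\omega_0 T_{ij}) \right] \tag{25}$$

and

$$A_{2k}(T_{ij}) = \frac{1}{1 + (k\omega_0 \tau_v)^2} \left[ \sin(k\omega_0 T_{ij}) + k\omega_0 \tau_v \cos(k\omega_0 T_{ij}) \right]. \tag{26}$$

These coefficients incorporate the effects of both the cell filter and the interconnect time
delays. When $z_{ij}$ is the time delay $T_{ij}$, $\partial V_{ik}/\partial T_{ij}$ is given by

$$\frac{\partial V_{ik}}{\partial T_{ij}} = -w_{ij} \begin{bmatrix} A_{1k}(T_{ij}) & -A_{2k}(T_{ij}) \\ A_{2k}(T_{ij}) & A_{1k}(T_{ij}) \end{bmatrix} \dot{Y}_{jk}, \tag{27}$$

where

$$\dot{Y}_{jk} = \frac{\beta_k}{T_0} \int_0^{T_0} \frac{dy_j(t)}{dt} \begin{bmatrix} \cos(k\omega_0 t) \\ \sin(k\omega_0 t) \end{bmatrix} dt. \tag{28}$$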

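Given the expressions for $\partial V_{ik}/\partial z_{ij}$, only the spectral cell errors need to be determined
in order to compute the adaptation driving term $\Delta z_{ij}$. For an output cell,

$$\delta_{ik} = \beta_k \sum_{k_2=0}^{K} \frac{1}{\beta_{k_2}} \left[ \frac{\partial Y_{ik_2}}{\partial V_{ik}} \right]^{T} E_{ik_2} \tag{29}$$

in general. If the output cell is linear so $y_i(t) = m_i v_i(t)$, the spectral cell error simplifies
to

$$\delta_{ik} = m_i E_{ik} \tag{30}$$

where $E_{ik}$ is the set of Fourier series components of the error sequence defined in
Equation (18). However, if the output cell has a nonlinear output function $S_i$, the spectral
components of $v_i(t)$ are spread across the frequency spectrum. This effect is captured by
the term,

$$\frac{\partial Y_{ik_2}}{\partial V_{ik}} = \begin{bmatrix} \partial Y^{\alpha}_{ik_2}/\partial V^{\alpha}_{ik} & \partial Y^{\alpha}_{ik_2}/\partial V^{\beta}_{ik} \\ \partial Y^{\beta}_{ik_2}/\partial V^{\alpha}_{ik} & \partial Y^{\beta}_{ik_2}/\partial V^{\beta}_{ik} \end{bmatrix}. \tag{31}$$

The components of this matrix can be calculated using:

$$\frac{\partial Y^{\alpha}_{ik_2}}{\partial V^{\alpha}_{ik}} = \frac{\beta_{k_2}}{2} \left[ \frac{S'^{\alpha}_{i|k-k_2|}}{\beta_{|k-k_2|}} + \frac{S'^{\alpha}_{i(k+k_2)}}{\beta_{(k+k_2)}} \right], \tag{32}$$

$$\frac{\partial Y^{\beta}_{ik_2}}{\partial V^{\beta}_{ik}} = \frac{\beta_{k_2}}{2} \left[ \frac{S'^{\alpha}_{i|k-k_2|}}{\beta_{|k-k_2|}} - \frac{S'^{\alpha}_{i(k+k_2)}}{\beta_{(k+k_2)}} \right], \tag{33}$$

$$\frac{\partial Y^{\alpha}_{ik_2}}{\partial V^{\beta}_{ik}} = \begin{cases} \dfrac{\beta_{k_2}}{2} \left[ \dfrac{S'^{\beta}_{i(k+k_2)}}{\beta_{(k+k_2)}} + \dfrac{S'^{\beta}_{i(k-k_2)}}{\beta_{(k-k_2)}} \right] & \text{for } k \ge k_2, \\[2ex] \dfrac{\beta_{k_2}}{2} \left[ \dfrac{S'^{\beta}_{i(k+k_2)}}{\beta_{(k+k_2)}} - \dfrac{S'^{\beta}_{i(k_2-k)}}{\beta_{(k_2-k)}} \right] & \text{for } k < k_2, \end{cases} \tag{34}$$

$$\frac{\partial Y^{\beta}_{ik_2}}{\partial V^{\alpha}_{ik}} = \begin{cases} \dfrac{\beta_{k_2}}{2} \left[ \dfrac{S'^{\beta}_{i(k+k_2)}}{\beta_{(k+k_2)}} - \dfrac{S'^{\beta}_{i(k-k_2)}}{\beta_{(k-k_2)}} \right] & \text{for } k \ge k_2, \\[2ex] \dfrac{\beta_{k_2}}{2} \left[ \dfrac{S'^{\beta}_{i(k+k_2)}}{\beta_{(k+k_2)}} + \dfrac{S'^{\beta}_{i(k_2-k)}}{\beta_{(k_2-k)}} \right] & \text{for } k < k_2, \end{cases} \tag{35}$$

where

$$S'^{\alpha}_{ik} = \frac{\beta_k}{T_0} \int_0^{T_0} S_i'(v_i(t)) \cos(k\omega_0 t)\, dt, \tag{36}$$

$$S'^{\beta}_{ik} = \frac{\beta_k}{T_0} \int_0^{T_0} S_i'(v_i(t)) \sin(k\omega_0 t)\, dt, \tag{37}$$

and $S_i'(v) = dS_i(v)/dv$. Clearly, the nonlinear case involves much more computation
than when the cells are linear.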

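For a hidden cell (i.e., one whose output is not an output of the network), the spectral
cell errors can be computed using the recursive back-propagation relationship,

$$\delta_{jk} = \beta_k \sum_i w_{ij} \sum_{k_2=0}^{K} \frac{1}{\beta_{k_2}} \left[ \frac{\partial Y_{jk_2}}{\partial V_{jk}} \right]^{T} \begin{bmatrix} A_{1k_2}(T_{ij}) & A_{2k_2}(T_{ij}) \\ -A_{2k_2}(T_{ij}) & A_{1k_2}(T_{ij}) \end{bmatrix} \delta_{ik_2}. \tag{38}$$

When the hidden cell is linear, the back-propagation expression can be simplified to

$$\delta_{jk} = m_j \sum_i w_{ij} \begin{bmatrix} A_{1k}(T_{ij}) & A_{2k}(T_{ij}) \\ -A_{2k}(T_{ij}) & A_{1k}(T_{ij}) \end{bmatrix} \delta_{ik}. \tag{39}$$

Using these expressions, the adaptation driving term $\Delta z_{ij}$ can be computed for each
adaptable interconnect.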


3.3  The  Training  Process 

A training epoch consists of cycling through all sets of input-output training sequences.
This process is illustrated in Figure 6 for two training sequences. For each training set,
the period $T_0$ is set to be the period of the current input and output sequence. Then, the
input sequence is presented to the network $n_c$ times in succession. The first $(n_c - 1)$ cycles
provide the transient time during which all transients should decay away. During the last
cycle, the various Fourier series components are calculated and the gradient information
in $\Delta z_{ij}$ is accumulated. This process is repeated for each training set, at the end of which
time, the weights and/or time delays are updated according to Equation (21).

If $n_c$ is not large enough for transients to decay away, the spectral information
computed during the last cycle will not be valid. The resulting adaptation probably will not
converge onto a correct or even usable solution.
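The effect of too few transient cycles can be seen with a single linear cell. The following sketch is a toy illustration rather than the report's simulator: it assumes the first-order filter of Equation (2) with a damping factor of 0.98, a sinusoidal input with a period of 200 iterations, and a hypothetical function name.

```python
import math

def fundamental_in_last_cycle(n_cycles, period=200, alpha_v=0.98):
    """Drive one linear cell with a sine for n_cycles periods and return the
    (cosine, sine) coefficients of the fundamental measured over the last period."""
    w0 = 2.0 * math.pi / period
    v = 0.0
    cos_acc = sin_acc = 0.0
    for t in range(n_cycles * period):
        u = math.sin(w0 * t)
        v = alpha_v * v + (1.0 - alpha_v) * u        # first-order cell filter
        if t >= (n_cycles - 1) * period:             # measurement cycle only
            cos_acc += v * math.cos(w0 * t)
            sin_acc += v * math.sin(w0 * t)
    return 2.0 * cos_acc / period, 2.0 * sin_acc / period

too_short = fundamental_in_last_cycle(1)    # no transient cycle at all
adequate = fundamental_in_last_cycle(10)    # nine transient cycles
settled = fundamental_in_last_cycle(20)     # reference steady state
```

With no transient cycle, the measured coefficients are visibly biased by the decaying start-up response, while ten and twenty cycles give the same values to rounding error.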

3.4  Performance  Considerations 

Several enhancements can be made to the SBP algorithm to decrease its average
convergence time. First, the weights and time delays can be updated after each training set
instead of after all training sets have been processed. Second, a momentum factor can be
introduced into the adaptation driving term, producing a new driving term $\Delta' z_{ij}[t]$ that
is given by

$$\Delta' z_{ij}[t] = \alpha_{\text{mom}}\, \Delta' z_{ij}[t-1] + (1 - \alpha_{\text{mom}})\, \Delta z_{ij}[t] \tag{40}$$

where $\Delta z_{ij}[t]$ is $\Delta z_{ij}$ at time step $t$ and $\alpha_{\text{mom}}$ is the momentum factor. Typically, $\alpha_{\text{mom}}$
is small (0.2-0.3). The weights and time delays are then updated using

*yW  =  2ul<-l|  +  (41) 

where  =  6“^/’’“. 
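The momentum smoothing of Equation (40) is a first-order recursive average of the raw driving terms. A minimal sketch follows; the function names are hypothetical, and the purely additive parameter update in `apply_updates` is an assumption made for illustration, not a statement of the exact update rule used in the program.

```python
def smooth_driving_terms(raw_terms, alpha_mom=0.25):
    """Momentum-smoothed driving terms, starting from a zero initial value."""
    smoothed, previous = [], 0.0
    for raw in raw_terms:
        previous = alpha_mom * previous + (1.0 - alpha_mom) * raw
        smoothed.append(previous)
    return smoothed

def apply_updates(z_start, raw_terms, alpha_mom=0.25):
    """Accumulate the smoothed terms onto one weight or delay.

    Assumption: the parameter is updated by simply adding each smoothed term.
    """
    z = z_start
    for step in smooth_driving_terms(raw_terms, alpha_mom):
        z += step
    return z

smoothed = smooth_driving_terms([1.0, 1.0, 1.0])
```

For a constant raw driving term, the smoothed term rises toward the raw value within a few steps when the momentum factor is in the quoted 0.2 to 0.3 range.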

Another enhancement is the inclusion of variable adaptation gains, so $\eta$ becomes
$\eta_{ij}$. That is, each weight and time delay has its own adaptive gain associated with it.
We used the SuperSAB (Super Self-Adapting Back-propagation) method for adjusting
these gains [2].

3.5  Discrete-Time  Considerations 

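The discrete-time form for the cell equations is given in Equations (1)-(3). However,
other approximations are necessary to simulate the SBP algorithm. First, the Fourier
series components are computed using the approximation,

$$Y^{\alpha}_{ik} \approx \frac{\beta_k}{T_0} \sum_{t=0}^{T_0 - 1} y_i[t] \cos(k\omega_0 t)\, \Delta t, \tag{42}$$

$$Y^{\beta}_{ik} \approx \frac{\beta_k}{T_0} \sum_{t=0}^{T_0 - 1} y_i[t] \sin(k\omega_0 t)\, \Delta t, \tag{43}$$

with $\Delta t = 1$ iteration.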
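The discrete approximation of the Fourier series components amounts to a pair of sums over one sampled period. The sketch below assumes a time step of one iteration and the normalization of 1 for $k = 0$ and 2 for $k > 0$; the function names and the test sequence are illustrative only.

```python
import math

def fourier_coefficients(samples, K):
    """Return (cos_coeffs, sin_coeffs) for harmonics 0..K of one period of samples."""
    T0 = len(samples)
    w0 = 2.0 * math.pi / T0
    cos_coeffs, sin_coeffs = [], []
    for k in range(K + 1):
        beta = 1.0 if k == 0 else 2.0
        cos_coeffs.append(beta / T0 * sum(y * math.cos(k * w0 * t)
                                          for t, y in enumerate(samples)))
        sin_coeffs.append(beta / T0 * sum(y * math.sin(k * w0 * t)
                                          for t, y in enumerate(samples)))
    return cos_coeffs, sin_coeffs

def reconstruct(cos_coeffs, sin_coeffs, T0):
    """Rebuild one period from the truncated Fourier series."""
    w0 = 2.0 * math.pi / T0
    return [sum(a * math.cos(k * w0 * t) + b * math.sin(k * w0 * t)
                for k, (a, b) in enumerate(zip(cos_coeffs, sin_coeffs)))
            for t in range(T0)]

T0 = 64
w0 = 2.0 * math.pi / T0
sequence = [0.5 + math.sin(w0 * t) + 0.25 * math.cos(3 * w0 * t) for t in range(T0)]
cos_c, sin_c = fourier_coefficients(sequence, 4)
rebuilt = reconstruct(cos_c, sin_c, T0)
```

Because the test sequence contains only harmonics 0, 1 and 3, the coefficients recover its amplitudes and `rebuilt` reproduces the sequence to rounding error.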

The time derivative $dy/dt$ is approximated by the backward difference formula,

$$\frac{dy_i}{dt} \approx \frac{y_i[t] - y_i[t-1]}{\Delta t}. \tag{44}$$

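Finally, continuous time delays can be approximated by a linear interpolator between
integral points. For example, the value of $y_j$ 5.3 iterations ago is approximated by

$$y_j[t - 5.3] \approx (0.7)\, y_j[t-5] + (0.3)\, y_j[t-6]. \tag{45}$$

In general, $T_{ij}$ can be decomposed into an integral part and a fractional part such that

$$T_{ij} = \lfloor T_{ij} \rfloor + \delta T_{ij} \tag{46}$$

where $\lfloor T_{ij} \rfloor$ is the largest integer less than or equal to $T_{ij}$ (i.e., the truncation function)
and $\delta T_{ij}$ is the fractional offset ($0 \le \delta T_{ij} < 1$). The general form for the linear interpolator
is

$$y_j[t - T_{ij}] \approx (1 - \delta T_{ij})\, y_j[t - \lfloor T_{ij} \rfloor] + \delta T_{ij}\, y_j[t - \lfloor T_{ij} \rfloor - 1]. \tag{47}$$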
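The fractional-delay interpolation generalizes the 5.3-iteration example directly. In the sketch below, the `history` layout (index `d` holds the output from `d` iterations ago) and the function name are assumptions made for illustration.

```python
import math

def delayed_output(history, delay):
    """Linearly interpolated output `delay` iterations ago.

    history[d] is the output d iterations ago (history[0] is the current one).
    """
    whole = math.floor(delay)        # integral part of the delay
    frac = delay - whole             # fractional offset, 0 <= frac < 1
    if frac == 0.0:
        return history[whole]
    return (1.0 - frac) * history[whole] + frac * history[whole + 1]

# Toy history in which the output d iterations ago equals 10*d.
history = [10.0 * d for d in range(8)]
value = delayed_output(history, 5.3)
```

For this linear toy history the interpolated value is 0.7 times the entry at 5 iterations plus 0.3 times the entry at 6 iterations, matching the worked example.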
3.6 Choosing Parameters


The main parameters to select are $n_c$, the number of cycles per epoch for each training set,
and $K$, the largest spectral component to compute. Several factors must be considered
when choosing both parameters.

In order for the Fourier spectral analysis to be valid, the network must be in a
dynamic steady state. Therefore, $n_c$ should be large enough so all transients can decay
away. However, the larger $n_c$ is, the longer the runtime because each epoch takes longer
to compute. Thus, the selection of $n_c$ can be an iterative process. If the network is
linear, an eigenvalue analysis can be performed for a reasonable set of weights and time
delays to determine the longest time constant. Then, $n_c$ initially can be set to, say, 4
times this duration. In general, though, a nonlinear network can be simulated for an
initial set of weights and time delays and the response time can be measured. Then,
$n_c$ can be set to some multiple of this time constant. During the training process, the
transient time can be measured to see if $n_c$ can be increased or decreased. Ideally, an
adaptive algorithm could be employed to adjust $n_c$ automatically, although we did not
experiment with any such algorithms during the program.

By comparison, choosing $K$ is much simpler, but the computation-speed tradeoff still
exists. Ideally, $K$ is as large as possible to represent the spectral information in the
sequences. However, as $K$ increases, the computation burden also increases, especially
for nonlinear networks. A working guideline is that in order to avoid aliasing when
calculating the Fourier coefficients, $K$ should be less than $T_0/10$, where $T_0$ is the period of the
shortest training sequence. This limit also places an effective bandwidth limitation on
the sequence to make it "smooth" and "slowly varying" in order to avoid aliasing with
the spectral measurements, even when enough points are included in the calculation.
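The guideline on $K$ can be turned into a small helper that reports how many samples fall within one period of the highest harmonic and the largest $K$ that stays below $T_0/10$. The function names and the `min_points` parameter are hypothetical conveniences, not part of the report.

```python
import math

def points_per_period(T0, K):
    """Number of samples per period of the K-th harmonic for a sequence of period T0."""
    return T0 / K

def largest_safe_K(T0_shortest, min_points=10):
    """Largest integer K strictly below T0_shortest / min_points."""
    return math.ceil(T0_shortest / min_points) - 1

k_for_200 = largest_safe_K(200)
density_at_24 = points_per_period(100, 24)
```

For a period of 100 points, `density_at_24` is only about 4 samples per period of the highest harmonic, far below the 10 samples implied by the guideline.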

3.7  Time  Delay  Wrap-Around 

When all training sequences have the same period $T_0$, the time delays $T_{ij}$ can become
"negative" by allowing them to wrap around 0 to $T_0$. Thus, if after an adaptation pass
$T_{ij} < 0$, the actual time delay can be set to $T_0 + T_{ij}$. This wrap-around may lengthen
the transient response, so $n_c$ must be large enough to accommodate this effect.
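The wrap-around is valid because a delay and the same delay shifted by one full period produce identical phase rotations at every harmonic. A minimal sketch, with hypothetical function names, is given below; it relies on Python's `%` operator returning a non-negative result for a positive modulus.

```python
import math

def wrap_delay(delay, T0):
    """Map a (possibly negative) adapted delay onto the range [0, T0)."""
    return delay % T0

def rotation(k, delay, T0):
    """Cosine and sine of the phase shift that `delay` applies to harmonic k."""
    theta = 2.0 * math.pi * k * delay / T0
    return math.cos(theta), math.sin(theta)

wrapped = wrap_delay(-5.5, 200)
before = rotation(3, -5.5, 200)
after = rotation(3, wrapped, 200)
```

The rotation entries for the negative delay and for its wrapped value agree to rounding error, so the steady-state spectral response is unchanged even though the physical delay, and hence the transient, becomes longer.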


3.8  Sample  Simulation  Results 

The simple recurrent network shown in Figure 7 was used to test the SBP algorithm on
an IIR network. Both the weights $w_{12}$ and $w_{21}$ and the time delays $T_{12}$ and $T_{21}$ were
allowed to be adaptable. The initial weights were set to 0.5 and the time delays to 0.
The input interconnect was fixed with $w_{10} = 1$ and $T_{10} = 0$. The input sequence was

$$x[t] = \sin\left(\frac{2\pi t}{200}\right) \tag{48}$$

so $T_0 = 200$ iterations. The desired output sequence was generated by the network by
setting:

$$\begin{aligned}
\tau_v &= 49.498, & \alpha_v &= 0.98, \\
w_{10} &= 1, & T_{10} &= 0, \\
w_{21} &= 1, & T_{21} &= 30, \\
w_{12} &= -1, & T_{12} &= 30, \\
m_i &= 1.
\end{aligned}$$

A total of five spectral components were computed, so $K = 4$. The number of cycles per
epoch was set to $n_c = 2$, allowing for 1 transient cycle. The evolution of the training
error $E$ during the resulting adaptation is shown in Figure 8. The evolutions of the
weights and the time delays are shown in Figure 9. The initial and final limit cycles are
illustrated in Figure 10.

Figure 7: A simple linear filter/oscillator with an infinite impulse response (IIR).

These plots show the successful adaptation performed by the SBP algorithm. The
oscillations apparent in the evolution of the training error and the weights are caused
by the gradient resets made by the SuperSAB adaptive gain algorithm. Since only one
training sequence was used here, the time delays were allowed to wrap around 0 to $T_0$.
The delay $T_{12}$ takes advantage of this ability as illustrated in Figure 9(b). If the
wrap-around is disabled, the adaptation settles into an unsatisfactory local minimum, thus
preventing the training from completing successfully.

Other simulations showed that various combinations of the weights and time delays
can be adapted. However, if the input interconnect $[w_{10}, T_{10}]$ is allowed to vary, the
adaptation path is such that the network develops an eigenvalue that is very close to
1 in magnitude. The resulting long time constant prevents the network from reaching
a steady state within the allocated time. Coincidentally, the adaptation converges to a
solution but this solution is not correct because it cancels out the long transient response.

The SBP algorithm also has been demonstrated on the FIR network shown in
Figure 11. The algorithm has trained successfully (1) the tap weights along the tapped delay
line and (2) the tap time delays with the tap weights set to 1. These cases correspond
to training amplitude-only and phase-only FIR responses, respectively.
[Figure 9 plot; horizontal axis: Training Epoch.]

Figure 9: The evolution of (a) the weights and (b) the time delays. The negative
delays indicate wrap-around has occurred. The actual delay is given by $T_{12} + T_0$
when $T_{12} < 0$.


Figure 10: (a) The initial (before training) and desired (final) output limit cycles.
(b) Snapshots of 8 output cycles taken during the training process with 100 epochs
between cycles. This plot shows the output cycle evolution as $w_{12}$, $w_{21}$, $T_{12}$, and $T_{21}$
are adapted.


Figure  11:  Basic  structure  of  a  finite  impulse  response  (FIR)  network  of  length  N. 


The amplitude-only case provides an indication of the strengths and weaknesses of
the SBP algorithm. With a tapped delay line with $N = 25$ cells, the impulse response $h[t]$
shown in Figure 12(a) was used as the desired output sequence and the input sequence
was a single unit impulse function. The total length of $h[t]$ was set to $T_0 = 100$ points to
allow a large range of $K$ values to be tested. The corresponding spectral coefficients for
$h[t]$ are shown in Figures 12(b) and (c).

With  K  =  6,  the  SBP  algorithm  can  train  the  FIR  network  so  that  only  the  first 
7  (include  k  =  0)  spectral  components  are  matched.  The  resulting  impulse  response 
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Figure  12:  The  desired  impulse  response  for  an  amplitude-only  FIR  network  and 
its  first  10  Fourier  series  coefficients,  (a)  The  first  30  points  in  /i(t].  The  remaining 
points  (70)  are  zero,  (b)  The  cosine  spectral  coefficients,  (c)  The  sine  spectral 
coefficients. 
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Figure  13:  The  learned  impulse  response  for  the  FIR  network  when  A'  =  6.  (a) 
The  evolution  of  the  total  training  error,  (b)  The  resulting  impulse  response  as 
compared  to  the  desired  A[lJ.  (c)  The  cosine  spectral  coefficients  of  the  actual  /ifl], 
(d)  The  corresponding  sine  spectral  coefficients. 
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Figure  14:  The  learned  impulse  response  for  the  FIR  network  when  I(  =  20.  (a) 
The  evolution  of  the  total  training  error,  (b)  The  resulting  impulse  response  as 
compared  to  the  desired  h[t].  (c)  The  cosine  spectral  coefficients  of  the  actual  h[t]. 
(d)  The  corresponding  sine  spectral  coefficients. 
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Figure  15:  Dual-output  XOR  network.  Cells  1,  2,  5,  and  6  are  linear  whereas  cells 
3  and  4  have  nonlinear  output  functions.  Cell  0  has  a  constant  output.  All  time 
delays  are  zero. 

is close to the desired shape but the trailing edge is not reproduced faithfully. When
K is increased to 20, the impulse response is much closer to the desired h[t], including
the trailing edge. However, when K is set to 24, there are only T_0/K = 100/24 ≈ 4
points per period of the sampling cosine and sine functions for computing the spectral
coefficients. This sampling is too infrequent and results in an aliasing condition. The
adaptation with K = 24 fails to converge and instead causes the tap weights to grow
without bound.
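
The sampling argument above can be checked numerically. The following is a minimal sketch, assuming a training period of T_0 = 100 samples as in this example; the function names are illustrative and are not taken from the SBP implementation. It computes the number of samples available per cycle of the highest harmonic and the cosine and sine Fourier series coefficients of a sampled sequence.

```python
import math

T0 = 100  # samples per training period, as in the FIR example


def points_per_period(k, period=T0):
    """Number of samples available per cycle of the k-th harmonic."""
    return period / k


def fourier_coefficients(seq, k):
    """Cosine and sine Fourier series coefficients of harmonic k (k >= 1)."""
    n = len(seq)
    a = 2.0 / n * sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(seq))
    b = 2.0 / n * sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(seq))
    return a, b


# Samples per cycle of the highest harmonic for the three K values discussed.
for K in (6, 20, 24):
    print(K, round(points_per_period(K), 2))
```

With K = 24 the highest harmonic is represented by only about four samples per cycle, which is the condition the text identifies as too coarse for the adaptation to converge.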

In a phase-only FIR network, the SBP algorithm can train either the tap time delays
or those in the delay line. However, simulations showed that the trained FIR networks did
not have the minimum time delays necessary to implement the filter. Thus, some
postprocessing of the resulting time delays would be necessary to realize a minimum-phase
phase-only FIR network.

The SBP algorithm also was successful in training a network to learn static patterns.
As an example, the network in Figure 15 was trained to learn the dual-output
exclusive-OR (XOR) function described by the mapping:

    Input Pattern  →  Output Pattern
      x_1  x_2          y_5  y_6
       0    0            0    1
       0    1            1    0
       1    0            1    0
       1    1            0    1

The output y_5 is the XOR output and y_6 is its complement.
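
The static training set for this example is small enough to write out directly. The sketch below is illustrative only: the dictionary and function names are assumptions, not part of the original simulator. It lists the four training patterns and checks them against the logical definition of the dual-output XOR.

```python
# Training patterns for the dual-output XOR: (x1, x2) -> (y5, y6).
TRAINING_PATTERNS = {
    (0, 0): (0, 1),
    (0, 1): (1, 0),
    (1, 0): (1, 0),
    (1, 1): (0, 1),
}


def dual_xor(x1, x2):
    """Return (y5, y6), where y5 is the XOR of the inputs and y6 its complement."""
    y5 = x1 ^ x2
    y6 = 1 - y5
    return y5, y6


# Every stored pattern agrees with the logical definition.
consistent = all(dual_xor(*x) == y for x, y in TRAINING_PATTERNS.items())
```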

With a feedforward network of this type, the number of cycles per epoch (n_e) should
be set to the number of layers including the input layer. Thus, n_e was set to 3 for this
network. The input layer should be included in this count because the spectral coefficients
for dy_i/dt need to be zero for static training patterns.

3.9  Disadvantages  of  the  SBP  Algorithm 

The  disadvantages  of  the  SBP  algorithm  with  respect  to  the  conventional  back- 
propagation  algorithm  are: 

1.  It  is  more  demanding  computationally.  There  is  more  overhead  required  to  store 
and  maintain  the  spectral  information.  Furthermore,  the  back-propagation  process 
involves  a  vector  quantity  instead  of  a  scalar  quantity. 

2.  Convergence  can  be  slower.  Training  information  is  not  collected  at  every  time  step 
but  only  after  the  entire  output  sequence  has  been  observed. 

3.  A  steady  state  network  solution  must  exist  for  the  training  to  be  successful.  If 
the  network  output  is  still  on  its  transient  response  or  is  chaotic,  the  spectral 
information  will  be  invalid  and  will  cause  the  adaptation  to  fail. 

4. Arbitrarily shaped sequences cannot be trained because of aliasing concerns. The
discrete-time Fourier series of the sequences must be computable without any
aliasing problems.

4  Limit  Cycle  Finite  State  Machine 

4.1  Overview 

A block diagram of a limit cycle finite state machine (LC-FSM) is shown in Figure 16. It
consists of a memory with many addressable limit cycles (e.g., a SONN) and a controller
to govern the transitions from one cycle to another. The cycles produced by the memory
correspond to the logical FSM states. The inputs and outputs to the LC-FSM are
intended to be continuous-time limit cycles also, but fixed-point inputs and outputs are
possible.

Like a traditional FSM, the operation of the LC-FSM is governed by a state transition
diagram. Each transition is determined by the current-state cycle from the memory,
the input cycle, and the relative phase between them. Thus, for the same state and
input cycles, several different state transitions are possible by assigning each transition
to a different relative phase. The total number of possible transitions is limited by
the resolution of the transition detectors, the phase-measuring components of the cycle
transition controller. These detectors will be trained using the SBP algorithm to recognize
the desired transition conditions.

Figure 16: Block diagram of a finite state machine based on limit cycles (LC-FSM).

Figure 17: Expanded block diagram of the LC-FSM memory.


4.2  Limit  Cycle  Associative  Memory 

The function of the memory in the LC-FSM is to store the current state of the machine.
An expanded block diagram of this component is shown in Figure 17. It has two
components, a latch and a self-oscillating medium. The function of the latch is to accept,
recognize, and then store the current memory input if the recognition is successful. The
latch output is a fixed point corresponding to the cycle to be selected. The self-oscillating
medium produces a unique oscillation for a unique input. With the output of the latch
driving it, the self-oscillating medium thus generates a unique limit cycle as the memory
output for the most recently recognized memory input. Since the latch is a simple storage
register, the main component of the limit cycle associative memory is the self-oscillating
medium. The SONN model presented in Section 2 is useful in this role.

4.3 Cycle Transition Controller


A detailed block diagram of the cycle transition controller is shown in Figure 18. It
performs two tasks: (1) to detect desired transitions and request the next state from the
memory, and (2) to generate the desired output cycles and/or vectors.

The goal of the first task is to transform a multidimensional oscillatory signal (two
limit cycles) and produce a single binary-like signal indicating the input has been
“recognized.” This signal then can be used to trigger a transition to a new state. The
transformation should be sensitive to the amplitudes and phases of the two limit cycles
(i.e., their shapes and relative phase). This task is the function of the transition detectors
(TDs) and the in-phase recognizers (IPRs) in Figure 18.

A transition detector network is shown in Figure 19. The network accepts two cycles
as its inputs, the input cycle to the LC-FSM and the current-state cycle from the memory
SONN. Cells D1 and D2 are linear and respond immediately, so their filter time constant
is 0 and y_i = v_i = u_i, where i is either D1 or D2 here. Cell D0 is a constant-output cell
with y_D0 = 1 and provides a source of trainable biases for the other two cells. All the
weights and time delays are trainable, except for the time delays associated with cell D0,
which are fixed at 0.

Because  a  transition  detector  contains  only  a  single  layer  of  linear  cells,  the  training 
process  using  the  SBP  algorithm  is  very  quick,  particularly  when  the  SuperSAB  adaptive 
gain  algorithm  is  enabled.  To  learn  a  given  set  of  input  and  current-state  cycles,  these 
cycles  are  presented  to  the  network  at  the  desired  relative  phase  and  the  output  is  trained 
to  be  the  function. 


$$y_1[t] = y_2[t] = \frac{1}{2}\left[1 + \sin\left(\frac{2\pi t}{T_0}\right)\right] \qquad (49)$$


where T_0 is the period of both cycles. This signal generates a linear limit cycle as shown in
Figure 20(a) and indicates a recognized oscillatory state. In order for the linear network
in Figure 19 to work, the input and current-state cycles must have the same period,
but this period can vary from one transition detector to another. Note, however, that
either the input or the current-state (but not both) could be constant and not a cycle.
Furthermore, the phase of the sinusoid in Equation (49) can be set arbitrarily.
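
As an illustration of the training target, the sketch below generates one period of the transition detector output, assuming the raised sinusoid of Equation (49) with both outputs equal and a period of 64 iterations (the approximate SONN period quoted in Section 4.5). The function name and the optional `phase` argument are illustrative assumptions.

```python
import math


def td_target(t, period, phase=0.0):
    """Raised sinusoid in [0, 1] used as the target for both TD outputs."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * t / period + phase))


period = 64
# Both outputs follow the same signal, so the (y1, y2) trajectory lies on the
# line y1 = y2: a "linear" limit cycle.
trajectory = [(td_target(t, period), td_target(t, period)) for t in range(period)]
```

Because the two outputs are identical at every time step, the trajectory in the (y_1, y_2) plane traces a straight line segment between (0, 0) and (1, 1).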

The second component in the recognition process is the in-phase recognizer, shown
in Figure 21. This network consists of two heterodyne circuits in parallel. For both
circuits, cells I1-I4, B1-B4, M1-M4, and O1 and O2 are linear (y_i = v_i) and respond
instantaneously (zero filter time constant). However, cells M1-M4 have a multiplicative
input structure (as opposed to the conventional additive form) such that

$$u_i[t] = \prod_{j=1}^{N_i} w_{ij}\, y_j[t - \tau_{ij}] \qquad (50)$$

where i is one of M1-M4 and j is one of either I1-I4 or B1-B4, whichever is appropriate.
Only the interconnects between cells O1 and O2 and M1-M4 are trainable. Both the
weights and the time delays are allowed to be trained. Cells O1 and O2 do not have
trainable biases as this would destroy the recognition properties of the network.
Essentially, the two heterodyne circuits create all possible cross products of the input with

Figure 18: Expanded block diagram of the cycle transition controller in the LC-FSM.
The cycle monitor is an optional component and thus has dashed input and
output arrows.

Figure 19: A transition detector network for the LC-FSM. All weights and time
delays are trainable here using the SBP algorithm, except for the time delays
associated with the constant-output cell. These delays are fixed at τ_ij = 0.


the input shifted by −0.5. Using the SBP algorithm, the intermediate outputs y_1 and
y_2 are trained to be 1 for all time when the input is the linear limit cycle described by
Equation (49). The SBP algorithm effectively combines the cross-products to cancel out
the oscillatory terms and leave only the DC value.
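
The cancellation can be illustrated with a hand-built combination of cross products. This is a sketch under stated assumptions, not the trained network: the weight of 4 and the quarter-period delay are chosen by hand, the period of 64 iterations is taken from Section 4.5, and all names are illustrative. Removing the 0.5 bias from the raised sinusoid leaves 0.5 sin(θ); squaring it and adding the square of the same signal delayed by a quarter period gives 0.25 (sin² θ + cos² θ) = 0.25, so a weight of 4 yields a constant output of 1.

```python
import math

PERIOD = 64  # iterations per cycle (approximate SONN period from Section 4.5)


def linear_cycle(t, offset=0.5, amplitude=0.5):
    """Raised sinusoid; the defaults give the desired TD output cycle."""
    return offset + amplitude * math.sin(2 * math.pi * t / PERIOD)


def shifted_cycle(t):
    """Same shape as the desired cycle but offset by 0.1 (not a valid match)."""
    return linear_cycle(t, offset=0.6)


def heterodyne_output(signal, t, weight=4.0, delay=PERIOD // 4):
    """Weighted sum of two cross products of the bias-shifted input.

    Hand-picked weight and delay (assumptions), standing in for the values
    that training would find.
    """
    now = signal(t) - 0.5
    earlier = signal(t - delay) - 0.5
    return weight * (now * now + earlier * earlier)


desired = [heterodyne_output(linear_cycle, t) for t in range(PERIOD)]
rejected = [heterodyne_output(shifted_cycle, t) for t in range(PERIOD)]
```

For the desired cycle the output is constant at 1, while the offset cycle produces an output that still oscillates, which is the property the recognizer relies on.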

Cells G1, G2, and O3 are used to combine the outputs of the two heterodyne circuits.
Cells G1 and G2 have Gaussian-shaped output functions in the form,

=  (51) 

to select the region of the (y_1, y_2) space about the point (1,1). The gain was set to
m = 50 and the cell filters were activated with a time constant of 20 iterations. The
finite bandwidth prevents trajectories passing over (1,1) from being falsely recognized
as being the desired linear limit cycle.

The actual threshold decision is made by cell O3 using the sum of the
outputs of G1 and G2. Cell O3 is a thresholding cell with finite bandwidth (a time
constant of 20 iterations) to inhibit any output oscillations about the threshold point
v_0 = 0.8.

The final output of the complete in-phase recognizer (y_3) is a binary signal that is 1
when  the  input  is  the  linear  limit  cycle  described  by  Equation  (49).  Because  its  function 
is  common  to  all  transition  detectors,  the  in-phase  recognizer  needs  to  be  trained  only 
once  and  then  replicated  for  each  TD.  When  combined  with  a  transition  detector,  the 
TD/IPR  combination  produces  a  1  every  time  a  transition  condition  (i.e.,  the  right  input 
and  current-state  cycles)  is  present. 

Figure 20: Possible outputs for a transition detector. Only the near-linear cycle
shown in (a) is to be considered a recognized state. Then, the input cycle and
the current-state cycle to this particular transition detector are amplitude- and
phase-matched, signifying a condition for a state transition has been recognized.
The spatially-distributed cycle in (b) and the chaotic trajectory in (c) indicate no
state transition condition is present.


4.4  Next-State  Controller 

The outputs of all TD/IPRs go to the next-state controller as shown in Figure 18. This
network is a simple vector-to-vector mapping network which forms the correspondence
between the detected transition condition and the next state. Its function is to realize
the state transition diagram of the LC-FSM.

4.5  Simulated  LC-FSM  Results 

An LC-FSM based on the SONN presented in Section 2 was simulated to test the
capabilities of the TD/IPR networks. Although the SONN contains many cycles, only the
ones corresponding to the constant inputs (x_1, x_2) = (0,1), (1,0), and (1,1) are
considered here. The cycles generated by these inputs will be referred to as 01, 10, and 11,
respectively. This SONN will be referred to as the memory SONN. For simplicity, the
output of the LC-FSM is simply taken to be the current state of the memory SONN.

In addition to the memory SONN, another identical SONN was used as the input
to the simulated LC-FSM. It is referred to as the input SONN. Before going into the
LC-FSM, the output of the input SONN is sent through a variable-length time delay to
allow the relative phase of the input SONN cycles to be adjusted manually with respect
to the current-state cycle from the memory SONN.

Figure 21: An in-phase recognizer network. The “-0.5” inputs come from a
constant-output bias cell such as D0 in Figure 19. All time delays are 0 except
where labeled (τ = 1). These delays are needed to compensate for the unit
propagation delay through cells B1 and B2. Redundant cells B3 and B4 are shown for
clarity to illustrate the two parallel heterodyne circuits.

Figure 22: State transition diagram for the simulated LC-FSM. The (±π/2) notation
denotes a relative phase shift with respect to the current-state cycle.


The simulated LC-FSM implements the state transition diagram shown in Figure 22.
The LC-FSM states correspond to the memory SONN cycles and the transitions are
labeled with the input SONN cycles. Thus, if the memory SONN is oscillating in the
state 01 and the input SONN is oscillating in the state 10 at the correct phase, the
LC-FSM should make a transition to the 10 state in the memory SONN. The actual SONN
cycles used and their relative phases are shown in Figure 23(a)-(b).

To demonstrate the phase sensitivity of the transition detection, two transitions from
the 11 state are induced by the 11 input cycle with two different relative phases. If the
input cycle leads the current-state cycle by +π/2, the transition is to the 10 state. If, on
the other hand, the input cycle lags the current-state cycle by π/2 (a relative phase of
−π/2), the state changes to 01. Since the period of the SONN oscillations is about 64
iterations, a phase lead/lag of ±π/2 equals a relative advance/delay of 16 iterations for
the input cycle with respect to the current-state cycle.
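
The conversion between relative phase and delay in iterations is simple arithmetic. The sketch below assumes a period of exactly 64 iterations (the text gives this as approximate); the function names are illustrative.

```python
PERIOD = 64  # iterations per SONN cycle (approximate value from the text)


def phase_to_delay(phase_deg, period=PERIOD):
    """Delay in iterations corresponding to a relative phase in degrees."""
    return phase_deg / 360.0 * period


def delay_to_phase(delay, period=PERIOD):
    """Relative phase in degrees produced by a delay given in iterations."""
    return (delay % period) * 360.0 / period


quarter_cycle = phase_to_delay(90)  # a phase of pi/2 expressed in iterations
```

The same relation reproduces the phases listed in Table 3: delays of 40, 2, and 43 iterations correspond to 225°, about 11°, and about 242°, respectively.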

As an example of the TD/IPR networks in operation, consider the transition 11 → 01.
The response of the transition detector for this transition to all 8 transition conditions
is shown in Figure 23(c). For the corresponding in-phase recognizer, the intermediate
output at cells O1 and O2 is shown in Figure 23(d). A composite view of the IPR graphs
is shown in Figure 24.

The  discrimination  capabilities  of  the  combination  of  the  TD/IPR  networks  now 

Figure 23: The response of the transition detector and in-phase recognizer for
the transition 11 → 01 to all 8 transition conditions. The end-bar on each cycle
indicates the starting point for the cycle. Transition #5 is the one by which
this transition is recognized. (a) The LC-FSM current-state cycle. (b) The LC-FSM
input cycle. (c) The output of the 11 → 01 transition detector. (d) The
intermediate output (y_1, y_2) of the corresponding in-phase recognizer at cells O1
and O2. These plots are typical of the other transitions.

Figure 24: Composite view of the in-phase recognizer output for the transition
11 → 01 to all 8 transition conditions. This plot is a combination of the graphs
shown in Figure 23(d). The shaded circular region around (1,1) is the recognition
region whereby a trajectory contained within indicates a transition condition is
present and accepted.


become apparent. The outputs of the TD mostly are shifted and scaled versions of the
desired linear limit cycle. However, the IPR separates these cycles into distinct parts of
the output space defined by cells O1 and O2. It thus becomes an easy task to decide if the
LC-FSM input cycle and current-state cycle correspond to a valid transition condition.
The cycle must stay close to the point (y_1, y_2) = (1,1). With the parameters chosen for
the TD/IPR networks, the cycle recognition is done in under 3 periods and has a relative
phase sensitivity of about 7°.

This particular transition (11 → 01) was chosen because it was the worst case
of the 8 possible transitions. In the other cases, the IPR outputs were more spread out,
making the discrimination even easier.

Given  the  set  of  8  binary  output  signals  from  the  TD/IPR  networks,  the  next-state 
selector  network  was  designed  to  set  and  reset  the  latch  for  the  memory  SONN  such  that 
the  state  transition  diagram  shown  in  Figure  22  was  realized.  The  resulting  network  is 
shown in Figure 25 and its function table is given in Table 1.
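
Functionally, the next-state selector is a lookup from the active TD/IPR output to the state to be latched. The sketch below is illustrative only: it covers just the four transitions exercised in Table 3 (#8, #4, #7, #3), and the choice to hold the current state when no detector, or more than one, is active is an assumption rather than something stated in the text.

```python
# Transition number -> next state, limited to the transitions listed in Table 3.
NEXT_STATE = {8: "10", 4: "11", 7: "01", 3: "11"}


def next_state(current, detector_outputs):
    """Map binary TD/IPR outputs {transition number: 0 or 1} to the next state."""
    active = [n for n, fired in detector_outputs.items() if fired and n in NEXT_STATE]
    if len(active) != 1:
        # Assumption: with no detection, or an ambiguous one, the latch is held.
        return current
    return NEXT_STATE[active[0]]


# Replay the order of detections reported in Table 3, starting from state 11.
state = "11"
history = []
for fired in (8, 4, 7, 3):
    state = next_state(state, {fired: 1})
    history.append(state)
```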

The  latch  network  for  the  memory  SONN  is  simply  a  pair  of  identical  set-reset  (SR) 
flip-flops  and  is  shown  in  Figure  26.  Each  flip-flop  has  two  inputs,  one  for  setting  the 
output  to  be  1  and  the  other  for  resetting  the  output  to  0.  The  latching  property 

Table 1: Input-output relationship for the next-state selector network shown in
Figure 25. The subscripts on the inputs correspond to the transition numbers
given in Figure 23.

Table 2: Function table for the latch in Figure 26.

Figure 25: Next-state selector network. The weights are given in Table 1. There
are no time delays here.

comes from the self-feedback connection on each cell (with a weight of 1) and the lateral
inhibitory connections. There are no time delays in any of the interconnects, nor is any
training required. The cells respond instantaneously (zero filter time constant) and have
a binary (on/off) output function described by

$$y_i = \begin{cases} 1, & v_i > v_0 \\ 0, & \text{otherwise} \end{cases}$$

where the threshold v_0 = 0.9 and i is either L1 or L2 here. The resulting function table
for each flip-flop is given in Table 2.
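
The latching behavior can be sketched with a single thresholding cell. This is a simplified model under stated assumptions: the self-feedback weight of 1 and the threshold of 0.9 come from the text, while the set weight of +1, the reset weight of −1 (standing in for the inhibitory connection), and the strict inequality at the threshold are assumptions made for illustration.

```python
V0 = 0.9  # threshold from the text


def threshold(v, v0=V0):
    """Binary (on/off) output function; strict inequality is an assumption."""
    return 1 if v > v0 else 0


def flip_flop_step(y_prev, set_in, reset_in):
    """One update of a set-reset cell with self-feedback weight 1.

    The +1 set weight and -1 reset weight are illustrative assumptions.
    """
    v = 1.0 * y_prev + 1.0 * set_in - 1.0 * reset_in
    return threshold(v)


# Set, hold, reset, hold.
y = 0
trace = []
for s, r in ((1, 0), (0, 0), (0, 1), (0, 0)):
    y = flip_flop_step(y, s, r)
    trace.append(y)
```

With these weights the cell keeps its previous output whenever both inputs are 0, which is the storage property the latch needs.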

A block diagram of the complete simulated LC-FSM is shown in Figure 27. It
illustrates the component networks and their relative interconnections. The only addition is
an adjustable time delay T_D so the relative phase of the input limit cycle with respect
to the current-state (memory) cycle can be varied. This ability is needed to establish
each of the 8 transition conditions because of the high phase sensitivity of the TD/IPR
networks.

Figure 26: The latch network for the associative memory in the LC-FSM. It consists
of two independent flip-flop networks, one for each input to the memory SONN.

Table 3: Schedule for changing the input cycle delay T_D and the resulting LC-FSM
state changes. The transition reference numbers (e.g., #4) are taken from
Figure 23. The times at which the transitions occurred are taken to be when the
latch outputs changed.

    Time of delay   Delay T_D      Relative   Transition    Transition   New LC-FSM
    change          (iterations)   phase      time          number       state
    (iteration)                               (iteration)
    0               0              0°         -             -            11
    400             40             225°       593           #8           10
    800             0              0°         897           #4           11
    1200            2              11°        1310          #7           01
    1600            43             242°       1808          #3           11


Before the simulation run, the memory latch outputs were set to (y_1 = 1, y_2 = 1)
and the memory SONN was run for 200 iterations to allow it to converge onto the 11
cycle. Then, the LC-FSM was simulated for 2400 iterations, during which time the input
SONN was set to the 11 state. Only the phase of this cycle with respect to the memory
SONN cycle was adjustable via T_D. The schedule on which T_D was changed is given in
Table 3. The state evolution of the LC-FSM during this run is shown in Figure 28. For
reference, the times at which T_D was changed and when the resulting state transitions
occurred are shown with shaded vertical lines.

In this figure, the temporal trace and various state-space snapshots are shown for both
the input cycle to the LC-FSM as delayed by T_D and the LC-FSM output as taken from
the memory SONN. Each state-space plot consists of the 64 points before the reference
arrow above the top (y_1) trace.

At the bottom of the figure, the corresponding evolutions of the TD/IPR networks
along with the memory latch outputs are shown to illustrate the timing relationship
between all the signals. The TD/IPR trace actually is a superposition of all 8 TD/IPR
outputs. Similarly, the latch trace is a superposition of both y_1 and y_2 latch outputs.

Several observations can be made from Figure 28. First, the transitions occurred
within 3 periods of the input limit cycle as predicted by the TD/IPR discussion. Next, the
TD/IPR signals are usually very short in duration. Once the latch outputs have changed,
the memory SONN changes its output cycle, thus destroying the previous transition
condition. Finally, the first transition (11 → 10) and the third transition (11 → 01)
show that the same input cycle can stimulate different transitions based on the relative
phase with respect to the memory cycle.

Figure 27: Block diagram illustrating the interconnection of the component
networks in the simulated LC-FSM.

Figure 28: State progression of the simulated LC-FSM.


5  Optical  Implementation  Considerations 

5.1  Optical  SONN  Architecture 

The  SONN  model  discussed  in  Section  2  and  shown  in  Figure  1  on  page  11  was  designed 
with  an  optical  implementation  in  mind.  The  SONN  has  the  following  desirable  features: 

1.  All  cells  have  the  same  nominal  output  characteristic.  SLMs  with  sine-squared  or 
parabolic  output  functions  effectively  match  the  nominal  sigmoid  curve  shown  in 
Figure  4  on  page  13. 

2. The off-center, on-surround canonical interconnect topology within the levels
simplifies the interconnect scheme and provides a low diversity of interconnects to
realize.

3. The model is very tolerant of static parameter variations (easily ±20%). These
variations can arise in several ways. First, the crosstalk within the interconnects can
appear as weight perturbations. A nonuniform readout beam effectively changes
y_max, the maximum cell output value, over many cells. Similarly, nonuniformities
within the SLM can cause the same effect, along with variations in the gain m of
the output function.

An optical architecture for the SONN depicted in Figure 1 is shown in Figure 29. It
is based on a ring of optically-addressed SLMs (O-SLMs) connected via interconnection
holograms (IHs). The O-SLMs implement the cell functions and are sequenced by a
computer controller. In Figure 29, O-SLMn contains the cells for the nth layer in each
level. For example, the cell arrangement for O-SLM1 is shown in Figure 30. This device
has the cells from the first layer in all three levels. Similarly, the second and third layers
of each level are on O-SLM2 and O-SLM3, respectively.

The IHs are fixed, computer-generated holograms which realize both the interlevel and
the intralevel interconnects. The interlevel connections are made by IH1. Its interconnect
mapping between the cells on O-SLM1 and O-SLM3 is shown in Figure 31. The off-center,
on-surround canonical interconnect topology in the levels is implemented by IH2 and IH3.
This intralevel mapping is shown in Figure 32.

The input to the SONN is obtained from an electrically-addressed spatial light
modulator (E-SLM) that is driven by the computer controller. Hologram IH1 performs the
input mapping from the input cells to the first layer in the first level. This mapping is
shown in Figure 33 for the case when the four input cells in the first level are independent
(in contrast to Figure 1 where they are derived from two external input cells).

The SONN output is collected by the controller using a two-dimensional
photodetection device such as a camera or a photodetector array. This output is derived from
the last layer in the first level. Hologram IH4 forms the corresponding mapping from
O-SLM3 to the camera and is illustrated in Figure 34.

This architecture implements the discrete-time approximation of the continuous-time
network equations used in Section 3 with no time delays in the interconnects. In order to

Figure 29: Optical architecture for the SONN shown in Figure 1. The thin lines
between the various components are light beams.

Figure 30: Cell layout for O-SLM1 in Figure 29 with respect to the SONN in
Figure 1.


realize time delays, one way is to store the most recent N outputs (y_i[t], y_i[t−1],
..., y_i[t−N+1]) on the O-SLMs and have a shift mechanism update the outputs at each
time step. This method only provides integral time delays so no interpolation in time
is required (nor is it possible to do conveniently). In addition, the optical devices must
have larger space-bandwidth products. For example, if N = 3 so the values y_i[t], y_i[t−1],
and y_i[t−2] are available, the 1 × 3 cell arrangement shown on the left side of Figure 30
would become the 4 × 9 layout shown in Figure 35. The mappings performed by the
interconnect holograms would have to be modified slightly to use the desired (delayed)
output.
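
The shift mechanism behaves like a short shift register per cell. The sketch below is a software analogy only, with an illustrative class name and interface: it keeps the most recent N outputs and allows an interconnect to read the output at any integral delay from 0 to N−1.

```python
from collections import deque


class DelayLine:
    """Holds the most recent `depth` outputs of one cell (integral delays only)."""

    def __init__(self, depth, initial=0.0):
        # Index 0 is the newest output y[t]; index k is y[t-k].
        self.taps = deque([initial] * depth, maxlen=depth)

    def push(self, value):
        """Shift all stored outputs by one step and store the newest one."""
        self.taps.appendleft(value)

    def read(self, delay):
        """Return y[t - delay]; raises IndexError if delay >= depth."""
        return self.taps[delay]


line = DelayLine(depth=3)  # N = 3, as in the example in the text
for y in (1.0, 2.0, 3.0, 4.0):
    line.push(y)
```

Because only N values are retained per cell, the oldest output is discarded at each shift, which mirrors the space-bandwidth cost noted in the text: every additional unit of delay requires another stored copy of each cell output.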

If the O-SLMs do not have internal shift capabilities, the optical shift technique shown
in Figure 36 can be used. This figure depicts an O-SLM from Figure 29 with the shift
path. The key aspect of the shift path is the one-pixel offset in the positioning of the left
mirror (M) and beamsplitter (BS). This offset causes the O-SLM output y_i[t] to be fed
back into the input position corresponding to y_i[t−1]. In order to work, this technique
requires that the O-SLM output does not change directly with the input. In other words,
the input light beams must be detected and measured first and then the output beam
updated.

There are two other issues which must be addressed for the optical SONN architecture:
implementing a mixture of excitation and inhibition with an SLM and realizing finite cell
bandwidth. The first issue stems from the fact that the input sides of current SLMs are
square-law detectors and thus measure only the intensity of the incident light. Since the
intensity is always a positive quantity, it is difficult to realize an inhibitory input signal

Figure 31: Interlevel connection mapping for IH1 from the cells on O-SLM3 to
those on O-SLM1.

Figure 32: Intralevel connection mapping for IH2 from the cells on O-SLM1 to
those on O-SLM2. These interconnects realize the off-center, on-surround canonical
topology. The mapping for IH3 from O-SLM2 to O-SLM3 is the same as IH2. The
figure key distinguishes excitatory from inhibitory interconnects.

Figure 33: SONN input connection mapping for IH1 from the E-SLM to the cells
on O-SLM1.

Figure 34: SONN output connection mapping for IH4 from the cells on O-SLM3
to the camera.

Figure 35: Cell output distribution on the O-SLMs when recent outputs are
retained. The left diagram shows the original O-SLMn cell output distribution and the
right diagram the distribution with recent outputs. The arrows between the cells in
the right diagram indicate the shift mechanism at each time step.



to  a  cell  on  an  SLM  when  the  neighboring  cell  may  have  an  excitatory  input  signal. 
Previous  methods  such  as  implementing  only  inhibitory  neurons  [3]  cannot  be  used  here. 
The  SONN  model  tries  to  minimize  this  problem  by  having  only  one  inhibitory  input  to 
each  cell.  Unfortunately,  this  approach  does  not  solve  the  problem. 

The  second  issue  addresses  the  sum  filters  in  the  cells.  These  filters  are  necessary  for 
generating  piecewise-continuous  (i.e.,  smooth  in  a  discrete-time  sense)  cycles.  The  filters 
are  optional  if  discontinuous  limit  cycles  are  acceptable,  but  this  is  not  the  case  for  the 
simulated  LC-FSM  because  the  TD/IPR  networks  require  continuous  limit  cycles  for  the 
SBP training algorithm to work. Underexposing the SLMs while writing them offers an
approximation  to  the  low-pass  filter,  but  only  until  the  SLMs  need  to  be  erased  to  avoid 
saturation. 

With  these  issues  in  mind,  we  have  conceived  a  hybrid  optical-VLSI  SLM  which  solves 
the  inhibition  and  filter  problems.  In  addition,  it  offers  an  elegant  solution  for  realizing 
time  delays.  A  cut-away  view  of  the  proposed  SLM  is  shown  in  Figure  37.  Its  functional 
block  diagram  is  shown  in  Figure  38. 

The SONN SLM implements the cell functions electronically and relies on optics to
do the interconnections. Unlike previous optical-VLSI devices [4-9], this device relies on
a three-dimensional integration technique to create an input side and an output side. The
input side contains two photodetectors, one for the excitatory light and the other for the
inhibitory light. These photodetectors are connected through the bulk substrate of the
device via vertically integrated wires to the output side, which contains all the processing
electronics. Thus, the three-dimensional integration needs to provide only a method for
connecting one plane with another plane with a few wires.

Figure 37: Optical-VLSI SLM for an optical SONN implementation. Each cell has
the functional structure shown in Figure 38. Only three of the four cells in a layer
as shown in Figure 35 are illustrated here.

A difference amplifier performs the necessary subtraction to compute the total
weighted cell input. The cell filter follows and is shown as a simple RC low-pass
filter. An electronically controllable switch selects either the filtered form of the weighted
sum or the weighted sum itself to be f_i(t). In this way, the filter can be either activated
for continuous-time cycles or deactivated for discrete-time cycles.

Figure 38: Block diagram of an electronic cell in the optical-VLSI SLM for the
optical SONN architecture.

Finally, the resulting signal is loaded into an analog latch at the next clock pulse. Also with this
clock pulse, the outputs of all the latches are loaded into the next respective latch in the
shift path. Once the latches have been loaded, their outputs are used to drive individual
optical light sources such as laser diodes or LEDs via buffer amplifiers. The final output
of the cell is a set of optical beams whose intensities are proportional to y[t-1], y[t-2],
..., y[t-N+1], respectively (N = 3 in Figure 38). All beams are on simultaneously.
Their uniqueness is maintained by their spatial separation. If time delays are not desired,
the shift path can be eliminated with the exception of the first analog latch.
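
The signal path described above (difference amplifier, switchable low-pass filter, clocked latch chain) can be sketched in discrete time as follows. This is only an illustrative model, not the circuit of Figure 38: the class name `SonnCell`, the filter coefficient `alpha`, and the exact timing of the first latch are assumptions, and any output nonlinearity of the cell is omitted.

```python
from collections import deque


class SonnCell:
    """Hypothetical discrete-time model of one electronic cell (illustration only)."""

    def __init__(self, n_latches=3, alpha=0.5, use_filter=True):
        self.alpha = alpha            # assumed low-pass coefficient, 0 < alpha <= 1
        self.use_filter = use_filter  # models the electronically controllable switch
        self.filtered = 0.0           # state of the first-order (RC-like) filter
        # Shift path: index 0 is the most recently latched value.
        self.latches = deque([0.0] * n_latches, maxlen=n_latches)

    def clock(self, excitatory, inhibitory):
        """Advance one clock pulse and return the beam intensities, newest first."""
        weighted_sum = excitatory - inhibitory                        # difference amplifier
        self.filtered += self.alpha * (weighted_sum - self.filtered)  # low-pass update
        selected = self.filtered if self.use_filter else weighted_sum
        # Appending on the left of a full deque drops the oldest value on the right,
        # which mimics every latch loading from its predecessor in the shift path.
        self.latches.appendleft(selected)
        return list(self.latches)


if __name__ == "__main__":
    cell = SonnCell(n_latches=3, use_filter=False)
    for exc, inh in [(1.0, 0.25), (0.5, 0.0), (0.0, 0.0)]:
        print(cell.clock(exc, inh))
```

Setting `n_latches=1` corresponds to the case in which time delays are not desired and only the first analog latch remains.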

The  main  disadvantages  with  the  SONN  SLM  are  mechanical  support  and  thermal 
cooling.  Since  both  the  top  and  bottom  surfaces  are  active,  the  substrate  can  be  mounted 
only  on  its  sides.  The  substrate  must  be  thick  enough  to  provide  sufficient  rigidity  and 
durability.  Fortunately,  the  thickness  of  the  substrate  is  not  constrained  by  the  electronics 
because  the  substrate  only  contains  wires.  As  for  the  cooling  issue,  the  cells  in  the  center 
of  the  device  have  limited  access  to  a  heat  sink.  This  access  is  necessary  because  the 
laser  diodes  consume  milliwatts  of  power  as  opposed  to  microwatts  for  all  the  other 
electronics.^ 

Admittedly,  a  SONN  SLM  with  all  components  mounted  on  one  side  would  have 
immediate  access  to  a  heat  sink  and  would  be  easier  to  fabricate.  However,  it  also  would 
complicate  the  interconnection  scheme  because  the  input  and  output  light  paths  would 
conflict.  If  this  conflict  could  be  resolved  through  a  redesign  of  the  interconnection 
holograms, the one-sided fabrication would offer superior mechanical support, thermal cooling
properties, and well-developed manufacturing techniques.

5.2 Computer-Generated Hologram Development

In addition to the SLMs, the interconnection holograms (IHs) form the other major component
of the optical SONN architecture. Considering the practical optical demands
of the SONN architecture, general, flexible, and efficient IHs were sought. A computer-generated
approach seemed best suited to the requirements, but after a review of computer-generated
hologram (CGH) fabrication techniques, we were motivated to develop
a CGH process to create IHs possessing high efficiencies, generality, low processing costs,
and expedient scheduling. Current high-resolution color printer technology was used as
a mechanism for creating multiple discrete phase levels in bleach-processed silver halide
photographic film. This technique allowed for the creation of arbitrary IHs possessing
the necessary high efficiencies.

Several techniques for creating CGHs have employed either binary or multiple-step
representations for amplitude-only, phase-only, or amplitude-and-phase CGHs. The
Lohmann and Lee 1974 binary amplitude-only holograms encode amplitude and phase
information by defining a blocking area and its displacement within a CGH cell. Of
the many binary amplitude-only encoding techniques [12], these methods possess the
highest diffraction efficiencies, 0.1-0.2%. These techniques are easily implemented with
a laser printer and photoreduction system but result in very poor optical efficiencies.
Multiple-step amplitude-only CGHs encode phase and amplitude information by discretely
approximating the continuous gray-level spectrum found when holograms are recorded
optically, but this technique requires a specialized gray-level photoplotter and yields
marginal efficiency considering its amplitude-only nature.

^One possible solution is to place a glass covering over all the laser diodes. The thermal conductivity
of glass is an order of magnitude larger than that of air, and thus the glass would act as a better, albeit
transparent, heat sink. Another possible solution is to make the substrate thick enough to allow a miniature cooling
system to be incorporated into it. Alternatively, the laser diodes could be removed entirely from the
SONN SLM and could be replaced by a reflective electro-optic material such as a miniature liquid crystal
display. In this case, the latches in the shift path in Figure 38 simply drive a high-impedance capacitive
load which requires very little power. The SONN SLM then would have to be read out using an external
laser with an expanded beam.

A natural extension for efficiency improvement is phase-only encoding to eliminate
absorptive loss. Binary phase approaches include binary Dammann gratings [13], which
provide controlled amplitude and phase relations in 1-D or uncoupled 2-D, and direct gratings
for Fresnel lens implementation [14]. Dammann gratings provide efficiencies up to
about 75% but require very accurate phase transition placement and are impractical for
coupled 2-D functions. Binary Fresnel lens implementations are limited to a maximum
41% efficiency. To improve these efficiencies, multiple phase level approaches including
the kinoform and blazed surface relief elements have been considered [15, 16]. These
techniques represent the highest efficiencies achievable, but require either gray-level
photoplotting or capital-intensive processing equipment for glass etching.
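
As a point of reference for the efficiencies quoted above, the standard scalar-theory estimate for a staircase phase profile with N equal steps spanning 2π is η = [sin(π/N)/(π/N)]². The short sketch below evaluates this idealized estimate; it assumes uniform steps and ignores rolloff and insertion loss, so it is not the model of [17], and the function name is chosen here for illustration.

```python
import math


def quantized_phase_efficiency(levels: int) -> float:
    """First-order efficiency of an ideal N-level staircase blaze (scalar theory)."""
    if levels < 2:
        raise ValueError("at least two phase levels are required")
    x = math.pi / levels
    return (math.sin(x) / x) ** 2


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n:2d} levels: {100 * quantized_phase_efficiency(n):.1f}%")
```

For two levels the estimate is 4/π², about 40.5%, which agrees with the 41% limit cited for binary Fresnel lenses.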

Each of the above processes has its own merit, but high-efficiency, low-cost, flexible,
and expediently processed CGHs were required for the SONN architecture. Our
technique creates multiple phase levels in bleach-processed black-and-white photographic
film by modulating film exposure with colors possessing differing spectral transmissions.
Figure 39 illustrates the entire process with a cartoon-style flow diagram. A conventional
IBM-compatible personal computer was used, but any computer is possible provided
that PostScript, a powerful illustration/documentation printer programming language, is
the ultimate file format. The PostScript program contains all necessary commands to
generate the required color placement for CGH and IH implementation. The resultant
color mask was photoreduced with standard high-resolution camera equipment onto
high-resolution black-and-white film, and photochemical processing developed and stabilized
the phase-only CGH image.

Current technology has made available inexpensive color printers that
offer high-resolution output. Of the many possible choices, we selected the QMS ColorScript
100 Model 10 for its quality of output and good price/performance ratio. This
printer cost ~$8,200 and offered 300 dots per inch resolution with a total of eight solid
colors. Output from the printer was taken on clear transparency film for backlit illumination
during the photoreduction process. We constructed a vibration-resistant, high-resolution
backlit copy stand for 30X photoreduction with off-the-shelf high-quality optical
components (camera, lens, and white sources). The input image plane possessed 5 to
10% illumination uniformity, and the photoreduced results were capable of approximately
200 line pairs per mm at an optical density of 3.0.
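
The figures above fix the smallest feature that can reach the film. As a rough check (a sketch using only the numbers quoted in this paragraph; the helper names are illustrative), one printer dot at 300 dots per inch reduced 30X can be compared with the width of a single line at 200 line pairs per mm:

```python
MICRONS_PER_INCH = 25_400.0


def reduced_pixel_um(dots_per_inch: float, reduction: float) -> float:
    """Size of one printer dot on the film after photoreduction, in micrometers."""
    return MICRONS_PER_INCH / dots_per_inch / reduction


def line_width_um(line_pairs_per_mm: float) -> float:
    """Width of a single line (half a line pair), in micrometers."""
    return 1000.0 / line_pairs_per_mm / 2.0


if __name__ == "__main__":
    print(f"reduced printer dot: {reduced_pixel_um(300, 30):.2f} um")
    print(f"copy-stand line width: {line_width_um(200):.2f} um")
```

The two values are of the same order (roughly 2.8 µm and 2.5 µm), which suggests that the copy stand resolution is comparable to a single reduced printer dot.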

The format of the camera was 4x5 inches, and we used high-resolution black-and-white
film of that dimension. The selection of the film was extremely important for
the distribution of color exposed/recorded optical densities (in turn, optical phases).
Each color acts as a spectral transmission filter to the backlit white light source,
thus narrowing the spectral power density impinging onto the black-and-white film.
The monochrome equivalent (luminance) of the printer colors yields a relatively linear,
stepped, increasing relation [17]. To maintain this relation, a film with a relatively flat
spectral sensitivity was necessary so that linearly distributed optical densities would
result. Kodak 649F on a 4x5 inch glass substrate was selected, and the empirical
mapping of color to optical density is illustrated in Figure 40. For Kodak 649F, we
observed repeatable mapping for film plates within the same emulsion batch given uniform
processing conditions, but differing emulsions exhibited different photographic speeds,
requiring exposure characterization for each emulsion. The chemical processing times
and conditions were optimized for the desired color mapping as illustrated [17].

Figure 39: Flow diagram of the color CGH process.

Figure 40: Nominal exposure color to density mapping for Kodak 649F.

Discrete phase levels were created by modifying the developing process to include
bleach. A reversal bleach was selected for removing the pigment of the latent image by
eliminating all of the exposed grains. This process yields less phase, or retarded phase, in
the emulsion regions that experienced higher net exposures. The empirical color to phase
mapping is illustrated in Figure 41. We measured the phase changes with a modified film
substrate grading technique we developed. A slight-angle wedge (an imperfect microscope
slide) was sandwiched to the film plate being measured with index-matching fluid, and
this wedge provided closely spaced, approximately straight fringes when the sandwich
was read out in reflection with a collimated laser source. The discontinuity of the fringes
due to differently (color) exposed regions was measured. By organizing the colors by
increasing luminance, the ambiguity in phase measurement was eliminated.

The major limiting factor for our color to phase mapping process was the photoreduced
rolloff caused by printer anomalies and camera diffraction limitations. We observed
poor print quality at resolutions (one to three printer pixels per color) near the
resolution limit of the printer we selected. The modulation transfer function (MTF) for
30X reduced color gratings was measured for optical density and is illustrated in Figure 42;
by the optical density to phase correlation, the phase rolls off in the same fashion. This
rolloff issue was observed to affect the diffraction efficiency of the example applications.

As the first application, eight-phase-level approximated Fresnel lenses were implemented.
The Fresnel lens diffraction efficiency was modelled for eight phase levels with
respect to the non-uniform phase step heights and/or widths [17]. A maximum diffraction
efficiency of ~86% was determined (excluding insertion losses) for a given relative
density distribution of eight colors with uniform spacing. A sample side-on view (optical
density, reverse contrast) of an example Fresnel lens implemented with color mapping is
shown in Figure 43; bleaching removes the reversed contrast. This figure also illustrates
how the outer Fresnel zones (smaller feature sizes) were rolled off, which resulted
in a non-uniform phase modulation depth over the extent of the lens, effectively reducing
diffraction efficiency.

Figure 42: Modulation transfer function of the 30X photoreduced printer colors as
the feature size decreases (in printer resolution units).

We constructed a variety of single and compound Fresnel lenslet arrays possessing
differing focal lengths and aperture sizes. An example of two lenslet arrays is illustrated in
Figure 44. We observed a peak measured efficiency of 67.2% for eight evenly spaced colors
(excluding insertion loss). Rolloff into the outer Fresnel zones reduced the maximum
diffraction efficiency from the 86% predicted peak. The highest efficiency 30 cm lens
exhibited a 130 µm spot size, corresponding to a Gaussian beam predicted size of 117.4
µm given the lens focal length and diameter of 2.4 mm. An insertion loss of 26.3%
due to front and back surface Fresnel reflection, scattering, and high absorption losses
was experimentally measured. The large absorption loss was due to staining by the bleach
used, but other reversal bleaches may possess lower insertion loss attributes. By exposing
lenses within an array differently, we were able to produce lenses of varying diffraction
efficiencies, which allowed us to quickly characterize different film emulsions for maximum
diffraction efficiency with only one film plate.

Figure 43: Optical density profile of a 30 cm Fresnel lens.

Ultimately, we implemented optical interconnections that offered a completely arbitrary
nature with high optical power efficiency. Arbitrary interconnections can be created
with discrete phase level approximated diffractive blazed gratings (synthetic blazed
gratings) via a computer and this process. Figure 45 illustrates how the gratings were
constructed with color to phase mapping. The blazed grating was used in transmission,
and the off-axis diversion or elevation, $\theta$, was specified by the period of the grating. The phase
depth of the grating is set to $2\pi$, or one wavelength, $\lambda$, and thus the off-axis diversion is
specified by

$$\tan\theta = \frac{\lambda}{\Lambda}$$

where $\Lambda$ is the period of the grating. For the interconnection examples we addressed, the
relative vector displacements, $x$ and $y$, from the input plane to the output plane define
the grating period as follows

$$\Lambda = \frac{\lambda d}{\sqrt{x^2 + y^2}}$$

where $d$ is the throw distance between the input plane and the output plane. The grating
lines are rotated by

$$\phi = \tan^{-1}\frac{x}{y}$$

within the input connection cell to provide azimuth control.
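
Reading the relations above as tan θ = λ/Λ, Λ = λd/√(x² + y²), and φ = tan⁻¹(x/y), the grating needed for one connection can be computed as in the following sketch. The function name and the example numbers (including the 632.8 nm wavelength) are assumptions made for illustration; the wavelength is not stated in this section.

```python
import math


def blazed_grating(x: float, y: float, d: float, wavelength: float):
    """Return (period, theta, phi) for a displacement (x, y) over throw distance d.

    All lengths share one unit; angles are returned in radians.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("a straight-through connection needs no grating")
    period = wavelength * d / r   # Lambda = lambda * d / sqrt(x^2 + y^2)
    theta = math.atan2(r, d)      # tan(theta) = sqrt(x^2 + y^2) / d = lambda / Lambda
    phi = math.atan2(x, y)        # equals atan(x / y) for y > 0, and is defined at y = 0
    return period, theta, phi


if __name__ == "__main__":
    # Hypothetical connection: 3 mm by 4 mm displacement over a 30 cm throw.
    period, theta, phi = blazed_grating(3e-3, 4e-3, 0.30, 632.8e-9)
    print(f"period = {period * 1e6:.2f} um")
    print(f"theta  = {math.degrees(theta):.3f} deg")
    print(f"phi    = {math.degrees(phi):.2f} deg")
```

Using `atan2` rather than a plain arctangent keeps the azimuth well defined when the connection has no y displacement.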

With this interconnection methodology, we implemented an arbitrary interconnection
with a 5x5 input plane that spelled out “MIT” in the output plane. Off-axis and on-axis
cases were considered, with emphasis placed on the on-axis case. The input-output
mapping for the on-axis case is illustrated in Figure 46. Using PostScript, the color
mask(s) necessary for the IH were generated and photoreduced with an exposure level chosen to
optimize diffraction efficiency. Figure 47 shows a photograph of the output plane;
the threshold functionality of the photographic film effectively eliminated anomalous on- or
off-axis noise.

Figure 44: Focal plane photograph of (a) a 5x5 30 cm lenslet array and (b) a 15x15
5 cm lenslet array.

For the on-axis case, the average diffraction efficiency was 54.2% with a peak efficiency
of 86.0% at the lower right corner of the M. The average contrast ratio was 11.5:1 with a
best case value of 18.2:1 and worst case value of 7.2:1, calculated against the average noise
power. The off-axis MIT interconnection exhibited similar efficiency, but it possessed a
higher average contrast ratio of 17.4:1, with a 22.6:1 best case and 13.6:1 worst case,
because the interconnection was directed out of the on-axis power region. Given the
rolloff and diffraction efficiency as a function of resolution, an off-axis deflection, $\theta$, of 0.6°
with at least 50% efficiency was realizable, which corresponds to a 7.5 µm minimum
feature size. Deflection of up to 0.78° with reduced efficiency was possible before printer
problems became insurmountable.
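
The link between minimum feature size and deflection can be checked with a short calculation. The sketch below assumes that one feature corresponds to one phase level, that a grating period contains eight levels, and that the readout wavelength is 632.8 nm (He-Ne); none of these is stated explicitly in this paragraph, so the result is indicative only.

```python
import math


def max_deflection_deg(feature_um: float, levels: int = 8,
                       wavelength_um: float = 0.6328) -> float:
    """Deflection angle (degrees) of a staircase blazed grating with period levels * feature."""
    period_um = levels * feature_um
    return math.degrees(math.atan(wavelength_um / period_um))


if __name__ == "__main__":
    print(f"{max_deflection_deg(7.5):.2f} deg")
```

Under these assumptions a 7.5 µm feature gives a deflection of about 0.6°, in line with the value quoted above.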


Figure 45: Construction of a synthetic blazed grating with the color to phase
mapping CGH process.

Figure 47: Experimental construction of an on-axis MIT interconnection.

Considering these numbers, this color CGH process has provided an excellent mechanism
for producing on-axis IHs with efficiencies > 50% and good contrast ratios. Other
interconnection examples, including the optical folded perfect shuffle and the sub-element
interconnection array for arbitrary fan-out (9 total connections) of input cells, were
implemented with similar successes [17].

5.3 Computer-Controlled SLM Characterization

We also developed a computer-controlled system to quickly evaluate the limitations of
spatial light modulators (SLMs) used within our optical systems. The parameters controlled
and measured included framing rates, framing dynamics, contrast ratios, MTFs,
optical sensitivities, exposure dynamics, output uniformity, and input imaging. A
PC-controlled data acquisition system with digital-to-analog output was set up with serially
controlled linear translation and rotation stages. Optical powers were measured with a
high-sensitivity autoscaled power meter interfaced to the central PC with an IEEE-488
connection.

With a computer-controlled Michelson interferometer, we were able to create arbitrary
high-spatial-frequency fringes and record them onto a microchannel spatial light modulator
(MSLM), an O-SLM, under test. Using these fringes, we mechanically scanned the
magnified MSLM output image with a fine razor-blade-apertured high-gain photodetector,
and under computer control, we collected all the necessary data to compile an MTF
curve for the device under test. By recording a uniform input exposure, we scanned the
x and y output plane with a pinhole mounted to the high-sensitivity optical power meter
detector to determine the device output uniformity. The control software was written in
a fashion to maximize flexibility and reconfigurability depending upon the application.
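
The data reduction used to compile the MTF curve is not described here. One common approach, sketched below as an assumption, takes the fringe modulation (I_max - I_min)/(I_max + I_min) of each scan and normalizes it to the lowest spatial frequency measured. The function names and the sample readings are hypothetical.

```python
def modulation(samples):
    """Fringe modulation (visibility) of one detector scan."""
    i_max, i_min = max(samples), min(samples)
    if i_max + i_min == 0:
        raise ValueError("scan contains no signal")
    return (i_max - i_min) / (i_max + i_min)


def mtf_curve(scans):
    """Map {spatial frequency: detector readings} to [(frequency, normalized modulation)].

    The modulation at the lowest measured frequency is used as the reference.
    """
    freqs = sorted(scans)
    mods = [modulation(scans[f]) for f in freqs]
    reference = mods[0]
    return [(f, m / reference) for f, m in zip(freqs, mods)]


if __name__ == "__main__":
    # Made-up readings at two fringe frequencies (arbitrary units).
    example = {10: [0.0, 2.0, 0.0, 2.0], 20: [1.0, 3.0, 1.0, 3.0]}
    for freq, value in mtf_curve(example):
        print(freq, value)
```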

6  Conclusions 

The original goal of the program was to develop hybrid optical inference machines. Instead
of expanding upon the conventional approaches (based on nonlinear matrix-vector
multipliers), we opted to look at a different method. Because of problems with fault
tolerance in the conventional approaches, we chose to consider encoding information using
limit cycles. This new approach created a whole new set of challenges and problems to
overcome. During the program, we solved many of the problems associated with processing
with limit cycles, but some issues remain unsolved with respect to a practical optical
implementation.

Given  the  unfamiliarity  of  processing  with  limit  cycles,  we  chose  to  concentrate  on  a 
simplified  form  of  symbolic  processing,  the  finite  state  machine  (FSM).  The  limit  cycle 
form  of  this  computation  paradigm  (LC-FSM)  requires  (1)  a  medium  that  supports 
many  cycles  and  (2)  a  method  for  establishing  couplings  between  cycles.  The  first  task 
corresponds  to  an  associative  memory  for  limit  cycles  and  the  second  one  is  a  controller 
for  switching  between  cycles.  Because  of  their  flexibility,  neural  networks  were  chosen  as 
the  working  medium. 

In  the  program,  we  created  the  self-oscillating  neural  network  (SONN)  model  for 
solving  the  first  task.  This  model  was  designed  with  an  optical  implementation  in  mind. 
It  is  very  tolerant  of  static  variations  in  the  network  parameters  (easily  in  excess  of 
±20%).  In  an  optical  implementation,  these  variations  appear  as  nonuniformities  in 
the  spatial  light  modulators  (SLMs),  the  readout  light,  and  crosstalk  in  the  holographic 
interconnections. The SONN is well suited to the first task because it has many cycles
available  with  no  training  or  programming  required.  They  can  be  selected  by  either 
constant  or  cyclical  inputs.  The  SONN  is  an  example  of  a  component  which  will  work 
as  an  associative  memory  for  limit  cycles  in  a  LC-FSM. 

For the second task, we derived the spectral back-propagation (SBP) training algorithm
for creating a LC-FSM controller for limit cycles. This algorithm is an extension of
the conventional back-propagation algorithm to train input-output sequences. The SBP
algorithm uses discrepancies in the Fourier series spectra of the output sequences as an
error criterion. This approach allows not only the weights but also the time delays associated
with the interconnects to be trained. Furthermore, the cells in the network can have
finite bandwidth via a first-order low-pass filter. We have demonstrated the algorithm
successfully on both feedforward and recurrent networks with both continuous-time and
discrete-time sequences.

In a simulated 3-state LC-FSM with 8 possible transitions, the SBP algorithm allowed
us to develop a very simple set of networks (the transition detector and the in-phase
recognizer, TD/IPR) for recognizing a particular multidimensional limit cycle. These
networks can do the recognition in under 3 periods with a very high degree of phase
sensitivity (7°). They permit the same input and current-state cycle of the LC-FSM
to stimulate different transitions depending upon their relative phase. Furthermore, the
portions of the TD/IPRs whose weights and time delays are trained by the SBP algorithm
are linear networks with no hidden cells. Thus, the TD/IPRs can be trained very quickly.
These networks form the key component of the LC-FSM controller. They generate binary
signals, each of which corresponds to a unique transition condition in the LC-FSM. The
transitions then can be made using a simple mapping network. The SBP algorithm thus
is a useful tool for creating a LC-FSM. Using the SONN and the SBP algorithm, we were
able to demonstrate successfully a working LC-FSM using computer simulations.

The problems left unanswered by the program are associated with creating a practical
optical LC-FSM, an intermediate but necessary milestone towards making a working
hybrid optical inference machine. We designed an optical architecture for the SONN and
developed a novel technique for making the required interconnection holograms using
conventional color printer technology. This IH foundry provided efficiencies exceeding
50% expediently, flexibly, and at a low cost. We demonstrated arbitrarily defined IHs
with the color CGH process. However, limitations with current SLMs motivated us to
propose a new general-purpose hybrid optical-VLSI SLM. Development of this type of
device would greatly benefit the research into hybrid optical inference machines,
particularly those based on limit cycles.

Hybrid optical inference machines: architectural
considerations


Cardinal  Warde  and  James  Kottas 


A  class  of  optical  computing  systems  is  introduced  for  solving  symbolic  logic  problems  that  are  characterized 
by  a  set  of  data  objects  and  a  set  of  relationships  describing  the  data  objects.  The  data  objects  and 
relationships  are  arranged  into  sets  of  facts  and  rules  to  form  a  knowledge  base.  The  solutions  to  symbolic 
logic  problems  involve  inferring  conclusions  to  queries  by  applying  logical  inference  to  the  facts  and  rules. 
The  general  structure  of  an  inference  machine  is  discussed  in  terms  of  rule-driven  and  query-driven  control 
flows.  As  examples  of  a  query-driven  inference  machine,  two  hybrid  optical  system  architectures  are 
presented  which  use  matched-filter  and  mapped-template  logic,  respectively. 


I.  Introduction 

A.  Definitions 

Symbolic logic problems involve, in an abstract
sense, a set of data objects and a set of relationships
describing the data objects. The data objects and
relationships constitute a knowledge base which is
generally arranged as sets of facts and rules. A fact is a
statement connecting a relationship with one or more
data objects so that the statement is always interpreted
as true. On the other hand, a rule is a statement
which defines a relationship using other relationships,
data objects, and/or facts.

A symbolic logic problem is usually stated in the
form of one or more queries, which are questions
concerning relationships and data objects. The queries
are answered by applying logical inference to the
knowledge base of rules and facts. This inference
process generates a set of assertions (inferred facts)
from the knowledge base. The solution to the queries,
therefore, becomes a set of conclusions in the form of
data objects, which is inferred from the set of assertions
so as to satisfy the queries.
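
The terms defined above can be made concrete with a toy knowledge base. The sketch below is an illustration only: the family relationships, the names, and the simple exhaustive inference loop are assumptions chosen for clarity and are not the optical architectures discussed in this paper.

```python
# Facts: (relationship, data object, data object), each always interpreted as true.
FACTS = {("parent", "ann", "bob"), ("parent", "bob", "cal")}


def grandparent_rule(known):
    """Rule: grandparent(X, Z) holds if parent(X, Y) and parent(Y, Z) hold."""
    return {("grandparent", x, z)
            for (r1, x, y1) in known if r1 == "parent"
            for (r2, y2, z) in known if r2 == "parent" and y1 == y2}


def infer(facts, rules):
    """Apply every rule repeatedly until no new assertions (inferred facts) appear."""
    known = set(facts)
    while True:
        new = set().union(*(rule(known) for rule in rules)) - known
        if not new:
            return known
        known |= new


def query(known, relationship, first):
    """Query: which data objects Z satisfy relationship(first, Z)?"""
    return sorted(z for (r, x, z) in known if r == relationship and x == first)


if __name__ == "__main__":
    assertions = infer(FACTS, [grandparent_rule])
    print(query(assertions, "grandparent", "ann"))
```

Here the conclusions returned by `query` are data objects drawn from the set of assertions, matching the definitions given in the text.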

B.  PROLOG 

Symbolic  logic  problems  are  relatively  common. 
They  arise  in  areas  such  as  expert  systems  and  other 
artificial  intelligence  applications.  In  recent  years, 
the  computer  science  language  PROLOG  has  become  a 
tool  for  solving  these  types  of  problem  on  electronic 
computers.' For example, two goals of fifth-generation
computers are (1) to develop a machine capable of
logical inference and data base operations and (2) to
design  a  language  based  on  PROLOG  that  would  be 
suitable  for  inferring  and  representing  knowledge.'^ 

To solve a query, electronic PROLOG sequentially
searches the knowledge base for the appropriate
rules and facts. This search process uses a flexible
pattern-matching technique called unification which
involves searching, matching, and backtracking
through the knowledge base.^'* The performance of
electronic PROLOG is limited by its use of serial searching
and backtracking. Paralog, an implementation
of  PROLOG  which  uses  parallel  unification,  addresses 
this  issue  and  is  currently  under  investigation;- 


C.  Role  of  Optics 

It is well known that 2-D parallel optical processors
inherently perform high-speed pattern matching.
Such systems should, therefore, be more efficient at
searching than their serial electronic counterparts because
the parallelism eliminates the need for backtracking
through the knowledge base. Furthermore,
since searching and pattern matching processors do
not require high accuracy or large dynamic range, optical
processors should in principle be well suited for
symbolic logic processing.

We believe, however, that optical inference machines
should be designed to be compatible with electronic
computers. The goal should be to exploit the
strengths of both systems so as to realize hybrid inference
machines that are more efficient and versatile
than either purely electronic or optical computers.
For example, an optical inference machine could potentially
be integrated into an electronic fifth-generation
computer so that a hybrid machine capable of
operating at speeds in excess of 10® logical inferences
per second (LIPS) could be produced.

Fig. 1. General structure of an inference machine.

D. History

Previous work in optical symbolic processing was
performed by several researchers in the late 1960s and
early 1970s. Gabor,® Akahori and Sakurai,® Nakajima
et al.,' and Lohmann and Werlich® used holography as
the basis for their processing techniques. Willshaw et
al.,^ Willshaw and Longuet-Higgins,*® and Gabor® approached
the problem using associative network concepts.
However, during the 1970s and early 1980s, the
emphasis of research on optical computing systems
shifted to numerical problems such as matrix-matrix
multiplication,"'*® array processing,'-* and solving sets
of linear equations.'®

More recently, there has been a resurgence of interest
in the area of optical symbolic processing.
Huang'®-'" has addressed the symbolic problem in a
general sense, investigating algorithms and architectures
for performing symbolic substitution optically in
classical finite-state machines. Furthermore,
Huang'® and Fisher et al.^^ have recognized that there
may be a possible role for optics in symbolic processors,
particularly in solving certain classes of artificial intelligence
problems. However, specific applications of
optical computers to symbolic logic processing appear,
until now, to have been unaddressed.

In this paper, the concepts associated with symbolic
logic processors are introduced, and the general architecture
of an optical machine capable of inferring logical
conclusions from a set of facts and rules is discussed.
The general system is approached from two
different information flow patterns: rule-driven and
query-driven flow. Two hybrid optical realizations
for a query-driven inference machine are presented
which use classical matched-filter logic and mapped-template
logic, respectively. The intent here is to
describe these systems from a conceptual point of view.
Therefore, no attempt is made to address all the issues
involved in realizing a practical system.

II.  General  Inference  Machine  Architecture 

The general structure of an inference machine is
shown in Fig. 1. It accepts as input a set of facts and a
set of rules from the knowledge base and one or more
queries. The output of the inference machine is a set
of specific conclusions which are logically inferred
from the facts and rules in response to the queries.

For  example,  a  set  of  data  objects  could  be  a  set  of 
names  of  people.  For  illustrative  purposes,  let  this  set 
be  denoted  as 


D = {Karen, Beth, Peg, Liz, Sue, Jean, Ruth,
     Mike, Tom, Bill, Jim, Fred, Bob, Sam}.          (1)

A set of relationships for D might be the possible
relationships between the people, such as marriage,
mother, father, male, and female. Let this set of relationships
be denoted by

R = {married-to, mother-of, father-of, son-of,
     daughter-of, child-of, is-male, is-female}.     (2)

The data objects and relationships are linked as a
collection  of  facts  and  rules  which  relate  the  elements 
of  D  and  R.  In  this  example,  the  facts  could  be  defined 
as 


Mike is-male.             Karen is-female.
Tom is-male.              Beth is-female.
Bill is-male.             Peg is-female.
Jim is-male.              Liz is-female.
Fred is-male.             Sue is-female.
Bob is-male.              Jean is-female.
Sam is-male.              Ruth is-female.
Mike married-to Karen.    Bob father-of Peg.
Bob married-to Beth.      Bob father-of Tom.
Jim married-to Liz.       Bob father-of Jean.
                          Jim father-of Ruth.
                          Fred father-of Bill.       (3)

Using  these  facts,  the  remaining  relationships  in  R 
may be defined as rules. For example,


X mother-of Y IF    Z married-to X AND
                    Z father-of Y,
X child-of Y IF     Y mother-of X OR
                    Y father-of X,
X son-of Y IF       X child-of Y AND
                    X is-male,
X daughter-of Y IF  X child-of Y AND
                    X is-female,                     (4)


where X, Y, and Z are variables. The bodies of these
rules (i.e., the part to the right of IF) consist of two
conditions, each of which could be a fact or another
rule. These conditions are then connected by the
logical operators AND or OR. In general, a rule could
have any number of conditions, and a condition could
have a logical NOT operation performed on it. For
example, the daughter-of rule could be modified to
use the son-of rule by defining it with

X daughter-of Y IF  X child-of Y AND
                    NOT X son-of Y.                  (5)


To satisfy a rule, there must be at least one data
value for all variables for which all conditions are simultaneously
true. In the mother-of rule, there must
be at least one value each for X, Y, and Z so that Z is
both married to X and the father of Y. Using the
format of Eq. (4), additional relationships such as sister-of
and brother-of are straightforward to define.
Together, the facts in Eq. (3) and the rules in Eq. (4)
form the knowledge base.

In general, a query into a knowledge base consists of
a rule with at least one variable. For example, a possible
query of this knowledge base could be “Who is the
mother of Jean?”, which can be expressed as

? mother-of Jean,                                    (6)

where  ?  represents  the  desired  unknown  data  object. 
From  the  knowledge  base,  only  the  assertion  Beth 
mother-of  Jean  is  true.  Hence  the  conclusion  of  Eq. 
(6)  is  that  the  query  is  true  when  ?  is  the  data  object 
Beth. 
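
The query of Eq. (6) can be traced in a few lines of code. The following is a minimal Python sketch and not part of the system described here: the triple representation of the facts and the function name mother_of are assumptions made for illustration, while the facts and the rule are those of Eqs. (3) and (4).

```python
# Facts of Eq. (3) as (subject, relation, object) triples.
# The triple layout is an assumption of this sketch; the is-male and
# is-female facts are omitted because the mother-of rule does not use them.
FACTS = {
    ("Mike", "married-to", "Karen"),
    ("Bob", "married-to", "Beth"),
    ("Jim", "married-to", "Liz"),
    ("Bob", "father-of", "Peg"),
    ("Bob", "father-of", "Tom"),
    ("Bob", "father-of", "Jean"),
    ("Jim", "father-of", "Ruth"),
    ("Fred", "father-of", "Bill"),
}


def mother_of(child):
    """Solve '? mother-of child' with the rule of Eq. (4):
    X mother-of Y IF Z married-to X AND Z father-of Y."""
    # Second condition: every Z with 'Z father-of child'.
    fathers = {s for (s, rel, o) in FACTS if rel == "father-of" and o == child}
    # First condition: every X with 'Z married-to X' for one of those Z.
    return {o for (s, rel, o) in FACTS if rel == "married-to" and s in fathers}


print(mother_of("Jean"))
```

An empty result corresponds to a query that cannot be satisfied from the knowledge base, as happens for Bill, whose father Fred has no married-to fact.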

Given a query and knowledge base, conclusions can
be inferred using either inductive or deductive reasoning.
In the inductive case, conclusions of a general
nature are inferred by the application of specific queries
to the knowledge base. The cardinality of the set
of induced conclusions could in general be quite large,
and, in principle, conclusions not representative of the
knowledge base would be possible.

On  the  other  hand,  deductive  reasoning  produces 
specific  conclusions  from  a  set  of  general  rules  and 
facts,  and  the  conclusions  are  always  a  subset  of  the 
knowledge  base.  For  simplicity  and  practicality,  we 
shall  limit  the  allowed  conclusions  to  the  data  objects 
within  the  knowledge  base.  Therefore,  in  this  paper, 
we  will  consider  only  machines  based  on  deductive 
reasoning. 

Block diagrams for two general architectures of a
deductive inference machine are shown in Figs. 2 and
3. Both systems have in common a knowledge base,
controller, and inference filter. The functions of the
controller are to (1) control the flow of information
through the inference machine, (2) accept queries as
input from the operator, and (3) transmit conclusions
to the operator as output. The knowledge base stores
all the data objects and relationships in the form of
facts and rules. The role of the inference filter is to
generate a set of all conclusions possible given a set of
rules and facts from the knowledge base.

The system in Fig. 2 corresponds to a rule-driven
inference machine, whereas that in Fig. 3 represents a
query-driven inference machine. The systems are distinguished
from each other by the methods they employ
to infer the conclusions. In the rule-driven system,
all possible assertions and facts from the
knowledge base are generated ab initio, and thereafter
the conclusions are derived from these inferences by
application of the query. In contrast, the query-driven
system first uses the query to select appropriate
subsets of the rules and facts and then infers specific
conclusions from these rules and facts.

Fig. 2. Block diagram of a deductive rule-driven inference machine.

The  rule-driven  system  of  Fig.  2  approaches  the 
ideal  parallel  system  in  that  the  assertion  generator 
produces  the  facts  and  all  possible  assertions  from  the 
entire  knowledge  base  by  replacing  all  the  rules  with 
appropriate  assertions.  In  the  previous  example,  the 
mother-of,  child-of,  son-of,  and  daughter-of  rules 
would  lead  to  the  assertions 

Beth mother-of Peg.       Tom son-of Bob.
Beth mother-of Tom.       Tom son-of Beth.
Beth mother-of Jean.      Bill son-of Fred.
Liz mother-of Ruth.
Peg child-of Bob.         Peg daughter-of Bob.
Peg child-of Beth.        Peg daughter-of Beth.
Tom child-of Bob.         Jean daughter-of Bob.
Tom child-of Beth.        Jean daughter-of Beth.
Jean child-of Bob.        Ruth daughter-of Jim.
Jean child-of Beth.       Ruth daughter-of Liz.
Ruth child-of Jim.
Ruth child-of Liz.
Bill child-of Fred.                                  (7)

Thus the output of the assertion generator would be
the  set  of  facts  and  assertions  defined  by  Eqs.  (3)  and 
(7).  Note  that  the  knowledge  base  is  not  updated  by 
the  assertion  generator  and  that  the  output  produced 
by  the  assertion  generator  is  computed  only  once. 
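
The role of the assertion generator can be illustrated with a short Python sketch. It is a software analogy only, under the assumption that each relation is held as a set of (left, right) pairs; the names generate_assertions and inference_filter are hypothetical. It applies the rules of Eq. (4) once to the facts of Eq. (3) and then answers queries by matching against the stored assertions.

```python
# Facts of Eq. (3); each pair (a, b) reads 'a relation b'.
FATHER_OF = {("Bob", "Peg"), ("Bob", "Tom"), ("Bob", "Jean"),
             ("Jim", "Ruth"), ("Fred", "Bill")}
MARRIED_TO = {("Mike", "Karen"), ("Bob", "Beth"), ("Jim", "Liz")}
MALES = {"Mike", "Tom", "Bill", "Jim", "Fred", "Bob", "Sam"}
FEMALES = {"Karen", "Beth", "Peg", "Liz", "Sue", "Jean", "Ruth"}


def generate_assertions():
    """Replace every rule of Eq. (4) by the assertions it implies."""
    # X mother-of Y IF Z married-to X AND Z father-of Y
    mother_of = {(x, y) for (z, x) in MARRIED_TO
                 for (z2, y) in FATHER_OF if z == z2}
    # X child-of Y IF Y mother-of X OR Y father-of X
    child_of = ({(x, y) for (y, x) in mother_of}
                | {(x, y) for (y, x) in FATHER_OF})
    # X son-of Y IF X child-of Y AND X is-male
    son_of = {(x, y) for (x, y) in child_of if x in MALES}
    # X daughter-of Y IF X child-of Y AND X is-female
    daughter_of = {(x, y) for (x, y) in child_of if x in FEMALES}
    return {"mother-of": mother_of, "child-of": child_of,
            "son-of": son_of, "daughter-of": daughter_of}


def inference_filter(assertions, relation, known_object):
    """Answer '? relation known_object' by matching stored assertions."""
    return {x for (x, y) in assertions[relation] if y == known_object}


ASSERTIONS = generate_assertions()   # computed once, before any query
print(inference_filter(ASSERTIONS, "mother-of", "Jean"))
```

The generator produces the 22 assertions listed in Eq. (7), which gives a sense of how quickly the stored set grows relative to the 8 married-to and father-of facts that the rules draw on.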

As shown in Fig. 2, the assertion generator of the
rule-driven machine transfers the entire set of facts
and assertions to an inference filter whose function is
to match the queries from the controller with the facts
and assertions to determine the data objects which
satisfy the queries. After it has determined the conclusions
for the query, the inference filter transfers the
conclusions to the controller for output to the operator.

In the example ? mother-of Jean, the inference
filter would compare the facts and assertions defined
by Eqs. (3) and (7) with the query given in Eq. (6).
Realizing that ? is the desired variable, the filter would
find a match between the query and the third assertion
given in Eq. (7) to obtain the answer Beth. In this
example, there was only one possible conclusion, but,
in general, several data objects may satisfy a query.

In contrast, the query-driven system of Fig. 3 is a
more sequential machine than the rule-driven system
of Fig. 2. Given a query from the operator, the controller
uses the rules associated with the query to select
subsets of rules and facts from the knowledge base that
are relevant to the query. In the example of Eq. (6),
the mother-of rule is associated with the query. The
controller would examine the mother-of rule as defined
in the knowledge base and extract its condition
relationships married-to and father-of.

Once it has obtained the necessary subsets of rules
and facts, the controller transfers these subsets to the
inference filter along with the known data objects from
the query [Jean in Eq. (6)]. The inference filter then
matches the rules with the known query data to infer
the set of data objects which make the query true [Beth
for Eq. (6)]. Finally, the inference filter sends the
conclusions back to the controller for output to the
operator.

Fig. 3. Block diagram of a deductive query-driven inference machine.

When consideration is given to implementation of
an inference machine, the query-driven system may
appear more attractive than the rule-driven system.
This is because inferring the possible assertions and
storing all the possible assertions and facts in the rule-driven
system could be inefficient, expensive, and difficult
to realize, particularly for rules which are recursively
defined (i.e., when the rule has itself as a
condition). Consequently, only query-driven systems
are considered in the remaining sections.

III.  Hybrid  Optical  Realizations 

We shall confine our discussion to optical inference
machines that complement the electronic computer.
A complete system, therefore, will be hybrid in nature.
This places design constraints on the input and output
interfacing devices of the optical system. The optimum
designs, therefore, are those that most effectively
combine the individual strengths of optics and electronics.
Two query-driven designs are described below,
the first of which uses matched-filter logic in the
inference filter, whereas the second is based on
mapped-template logic.

In these systems, the parallelism and speed of optics
are exploited to perform the functions of searching,
matching, and logic. The role of the electronics is to
perform information storage and retrieve and transfer
data, rules, and operator queries to the optical processor.
Thus in Fig. 3 the inference filter is the optical
processor, while the controller and knowledge base
constitute the electronic support system.

To implement these optical inference machines,
three types of optical devices are required: (1) an
input interfacing device which converts electrical signals
to 2-D optical signals; (2) an optical logic device;
and (3) an output interfacing device for transforming
optical signals into electrical signals. The input interfacing
device and optical logic device should exhibit at
least short-term storage.

In the specific systems discussed below, the electrical-to-optical
input device could be any 2-D electrically
addressed spatial light modulator (E-SLM) which
has short-term storage, such as the e-beam MSLM.^.**
An example of an optical logic device which can perform
2-D logic with memory is the photo MSLM,-^'-^
which is an optically addressed spatial light modulator
(O-SLM). The logic operations that can be performed
internally by the photo MSLM include AND, OR,
NAND, NOR, XOR, and NOT. The optical-to-electrical
output device is a 2-D photodetector array. To
obtain good noise rejection and low error rates, digital
optical signals (binary intensity levels) are assumed for
all input and output signals in the optical processor.

A. Matched-Filter Optical Inference Machine

The general matched-filter optical inference machine
employs analog pattern recognition techniques
and parallel optical logic to apply a set of given rules to
a set of facts to infer a set of logical conclusions to the
queries. This method is similar to the optical correlograph
system described by Willshaw and Longuet-Higgins.'®

Figure 4 shows a specific implementation of a query-driven
matched-filter hybrid optical inference machine.
This machine consists of an electronic controller,
two E-SLMs, two O-SLMs, and a photodetector
array which is operated in a thresholding mode. In
this and subsequent figures, it should be noted that (1)
the input light to the O-SLM is absorbed within the
device and is not transmitted, and (2) the readout light
is reflected out of the device by an internal mirror.

In the matched-filter system of Fig. 4, the facts and
rules are grouped in block form (subsets) and stored
electronically in the controller for rapid retrieval and
transfer to the optical system. The two E-SLMs, O-SLM 1,
the lenses L1, L2, and L3, and the photodetector
array are arranged to form a classical VanderLugt
matched-filter system.-^ Thus lens L1 is one focal
length away from the planes P1 and Pj, lens L2 is one
focal length away from planes Pj and P4, and lens L3 is
one focal length away from planes Pj, P5, and P6. The
multiplication of the Fourier transforms of the signals
to be matched is performed in O-SLM 1, and the
matched-filter output is recorded on the photodetector
array (shutter S1 open, S2 closed). The photodetector
then transfers its output to the controller.

Fig. 4. Matched-filter optical inference machine.

If the query dictates that several rules must be applied
to the facts in succession, the resulting matched-filter
outputs can be combined by using the optical
logic capabilities of O-SLM 2. With S1 closed and S2
open, the logic output of O-SLM 2 can be imaged onto
the photodetector array using lens L4 and the photodetector
output fed back to the controller. This ability
permits rules to be applied as many times as necessary
to various subsets of facts to generate the logical conclusions.

When  operating  the  matched-filter  optical  inference 
machine,  the  operator  queries  the  system  through  the 
electronic  controller.  In  response,  the  controller 
writes  the  applicable  subsets  of  facts  onto  E-SLM  1 
and  the  applicable  subset  of  rules  onto  E-SLM  2.  This 
information  is  coded  as  a  set  of  predetermined  2-D 
binary-level  patterns.  In  the  query  example  of  Eq.  (6), 
the  mother-of  rule  and  the  complete  set  of  facts  in  Eq. 
(3)  would  be  the  applicable  sets. 

The controller then activates O-SLM 1, which holographically
records the Fourier transform of the facts
as formed by lens L1. The rules are similarly transformed
by lens L2, and this transform is used to read
out O-SLM 1 via mirror M1 as shown in Fig. 4. The
output of O-SLM 1 is transformed by lens L3 to form
the matched-filter output on the photodetector array.
This output consists of a set of focused spots of light
which indicates the positions of the matches. These
signals are then stored in O-SLM 2 and/or fed back to
the controller, which then uses this input to select the
possible conclusions from the set of facts.
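
The thresholded correlation output can be mimicked numerically. The Python sketch below is only an analogy: the optics form the correlation by multiplying Fourier transforms, whereas the code evaluates the same correlation sum directly in the spatial domain. The 4-bit codes in CODES, the one-fact-per-row scene layout, and the function names are all hypothetical. Each code has the same number of ones, an assumption that lets a single threshold separate exact matches from partial overlaps.

```python
def correlate2d_valid(scene, template):
    """Cross-correlation of two binary 2-D patterns (no padding)."""
    rows, cols = len(scene), len(scene[0])
    trows, tcols = len(template), len(template[0])
    out = []
    for r in range(rows - trows + 1):
        out_row = []
        for c in range(cols - tcols + 1):
            total = sum(scene[r + i][c + j] * template[i][j]
                        for i in range(trows) for j in range(tcols))
            out_row.append(total)
        out.append(out_row)
    return out


def match_positions(scene, template):
    """Positions whose correlation reaches the full-match threshold,
    playing the role of the bright spots on the photodetector array."""
    threshold = sum(sum(row) for row in template)
    corr = correlate2d_valid(scene, template)
    return [(r, c) for r, row in enumerate(corr)
            for c, value in enumerate(row) if value >= threshold]


# Hypothetical binary codes for the data objects, two ones each.
CODES = {"Peg": [1, 1, 0, 0], "Tom": [1, 0, 1, 0], "Jean": [1, 0, 0, 1],
         "Ruth": [0, 1, 1, 0], "Bill": [0, 1, 0, 1]}
FATHER_OF = [("Bob", "Peg"), ("Bob", "Tom"), ("Bob", "Jean"),
             ("Jim", "Ruth"), ("Fred", "Bill")]

# One row per father-of fact, holding the code of the child.
scene = [CODES[child] for (_, child) in FATHER_OF]
hits = match_positions(scene, [CODES["Jean"]])
fathers = [FATHER_OF[r][0] for (r, _) in hits]
print(hits, fathers)
```

The row index of each hit tells the controller which stored fact produced the match, which is how the sketch recovers the father's name from the position of the spot.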

Several options exist at this point, depending on the
nature of the query being solved. For example, the
controller could now load another part of the query
into E-SLM 2, perform a second matched-filtering
operation, and with S1 closed and S2 open perform a
logical AND (with O-SLM 2) of the second correlation
and the first, which is already stored in O-SLM 2. The
output of O-SLM 2 would then be read out onto the
photodetector array. Thus the matched-filter inference
machine is capable of sequentially performing all
combinations of 2-D optical pattern correlations and
binary-level logic operations on patterns representing
the data objects, rules, facts, and queries.

To  solve  the  ?  mother-of  Jean  query  in  Eq.  (6),  this 
system  would  first  examine  the  query  for  the  specified 
data  objects  (Jean  in  this  case)  and  would  then  treat 
the  mother-of  rule  as  if  its  variables  were  replaced  by 
the  appropriate  data  objects.  In  this  case,  the  effec¬ 
tive  mother-of  rule  would  become 

? mother-of Jean IF  Z married-to ? AND
                     Z father-of Jean.               (8)

Comparing Eq. (8) with the original mother-of rule as
defined in Eq. (4), the variables X and Y have been
replaced with the desired unknown symbol ? and the
data object Jean, respectively. Since the mother-of
rule has two conditions, the controller has to invoke
two matched-filtering operations.

The order in which the conditions are satisfied does
not matter since all of them must be true for the query
to be satisfied. Since the second condition has the
data object Jean as a constraint, the first matched-filtering
operation matches the father-of facts
(placed on E-SLM 1) with the data object Jean (placed
on E-SLM 2). The output of the matched filter is
then a representation of all facts associated with the
condition father-of Jean. In this case, there is only
one fact associated with this condition, Bob father-of
Jean. The controller then retrieves the father's name
Bob and matches the condition Bob married-to with
the set of facts. The second matched-filter output
points to the fact Bob married-to Beth. Finally, the
controller simply associates the conclusion Beth with ?
and returns the conclusion to the operator.

In the case where there are several matches, it is
possible for the controller to match all the resulting
conclusions with the next condition for full parallelism.
Furthermore, if no match is made (i.e., no spots
of light above threshold on the photodetector array),
the condition cannot be satisfied, making the query
false.

The block electronic storage scheme suggested here
is not the most efficient means of storing the rules and
the facts because a single data object may be associated
with several different facts. However, because electronic
storage is relatively inexpensive, block-form
storage does not appear to be inappropriate for the
initial investigations of these machines.

Since data objects are not expected to change often,
partitioning the knowledge base into blocks will generally
not have to be done frequently. The advantage of
block electronic storage is that it not only reduces the
data acquisition and retrieval time but also eliminates
the need to transfer the entire knowledge base to the
spatial light modulators, which currently have only
modest space-bandwidth products.

B.  Mapped-Template  Optical  Inference  Machine 

In the mapped-template optical inference machine,
mapping templates are used to store the relationships
between the data objects and are thus defined by the
facts. Conclusions are inferred to queries by applying
these mapping templates to the data objects in the
order prescribed by the rules. This usage of mapping
templates is similar to the associative nets described
by Willshaw and Longuet-Higgins.‘°

Using the example defined by Eqs. (1)-(6), the relations
is-male, is-female, married-to, and father-of
from the facts in Eq. (3) would map an input set from
D, the set of data objects defined in Eq. (1), to an
output set, also from D. Let D_i and D_o represent the
input and output sets of data objects. Furthermore,
let the data objects in the mth position of D_i and D_o be
denoted by d_im and d_om. Using the data set D for D_i
and D_o as defined in Eq. (1), the mapping templates
corresponding to the is-female and father-of facts as
defined in Eq. (3) are shown in Fig. 5 with the elements
of the input set D_i (d_ix) along the columns (x axis) and
the elements of the output set D_o (d_oy) along the rows (y
axis) of the templates.

Fig. 5. Mapping templates for (a) the is-female facts and (b) the
father-of facts for the entire set of data objects in Eq. (1).

The mapping templates are binary masks consisting
of transparent squares (logical 1 and shown as black
squares in Fig. 5) on an opaque background (logical 0
and shown as white in Fig. 5). The interpretation of
these templates is as follows: A transparent square in
the (x, y) position of, say, father-of indicates the fact

d_ix father-of d_oy.                                 (9)

Given these two templates, the mapping templates for
is-male and married-to are straightforward to generate.

Note that the mapping between D_i and D_o is not
necessarily one-to-one. However, a mapping template
is reciprocal in that if the right-hand side of Eq.
(9) is specified instead of the left, the relationships for
the left-hand side may be inferred from the template.

Alternatively, to limit the size of the mapping template
and conserve space, D could be subdivided into
subsets whose data objects are related in some way.
Considering the facts in Eq. (3), it is reasonable to split
D into a set of males and a set of females denoted by

D_m = {Mike, Tom, Bill, Jim, Fred, Bob, Sam},
D_f = {Karen, Beth, Peg, Liz, Sue, Jean, Ruth},      (10)

where D_m and D_f represent the male and female sets,
respectively. With these subsets, the relationships is-male
and is-female would no longer be needed.

With the data set partitioned, the mapping templates
for the factual relationships between the elements
of D_m and D_f would simply be the corresponding
regions in the original full-size mapping templates in
Fig. 5. For the relation is-female and the data set D_m,
the template would always be opaque.

To perform logical inferring, the mapping-template
concept is implemented as illustrated in Fig. 6. Given
an input vector D_i, the associated output vector D_o for
a particular mapping template is found by first vertically
expanding D_i along the y axis so it forms an array,
each row of which equals D_i, as shown in Fig. 6. This
expanded form of D_i is then optically overlaid with the
mapping template using imaging optics and a 2-D logical
AND operation is performed. The resulting output,
when viewed along the rows, corresponds to the
output vector D_o.

To perform the reciprocal operation of the mapping
template, the input vector would be expanded horizontally
and logically ANDed with the mapping template.
The output vector would then be taken looking down
the columns.
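
The two template operations can be written out in a short Python sketch. This is a software illustration under stated assumptions, not a model of the optics: the template is held as a nested list indexed template[y][x], following the (x, y) convention of Eq. (9), and the function names are hypothetical.

```python
# Data objects in the order of Eq. (1).
D = ["Karen", "Beth", "Peg", "Liz", "Sue", "Jean", "Ruth",
     "Mike", "Tom", "Bill", "Jim", "Fred", "Bob", "Sam"]
FATHER_OF = [("Bob", "Peg"), ("Bob", "Tom"), ("Bob", "Jean"),
             ("Jim", "Ruth"), ("Fred", "Bill")]


def make_template(pairs):
    """template[y][x] = 1 when 'D[x] relation D[y]' is a fact, Eq. (9)."""
    n = len(D)
    template = [[0] * n for _ in range(n)]
    for left, right in pairs:
        template[D.index(right)][D.index(left)] = 1
    return template


def one_hot(selected):
    """Binary vector over D marking the selected data objects."""
    return [1 if d in selected else 0 for d in D]


def forward(template, vec):
    """Expand vec vertically, AND with the template, view along the rows."""
    n = len(vec)
    expanded = [list(vec) for _ in range(n)]        # every row equals vec
    anded = [[expanded[y][x] & template[y][x] for x in range(n)]
             for y in range(n)]
    return [1 if any(anded[y]) else 0 for y in range(n)]


def reciprocal(template, vec):
    """Expand vec horizontally, AND with the template, look down the columns."""
    n = len(vec)
    expanded = [[vec[y]] * n for y in range(n)]     # every column equals vec
    anded = [[expanded[y][x] & template[y][x] for x in range(n)]
             for y in range(n)]
    return [1 if any(anded[y][x] for y in range(n)) else 0 for x in range(n)]


def names(vec):
    """Data objects marked by a binary vector, in the order of D."""
    return [d for d, bit in zip(D, vec) if bit]


T = make_template(FATHER_OF)
print(names(forward(T, one_hot({"Bob"}))))       # objects Bob is father of
print(names(reciprocal(T, one_hot({"Jean"}))))   # father of Jean
```

The forward call shows a one-to-many mapping (one input, three outputs), while the reciprocal call recovers the left-hand side of Eq. (9) from the right-hand side.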

Depending on the mapping template, it is possible
for multiple inputs in D_i to produce the same output
element in D_o. For this reason, a 2-D output photodetector
array is used for establishing the exact input-to-output
correspondence, should this be needed in solving
the query.

A hybrid optical inference machine which implements
mapped-template logic is shown in Fig. 7. It
consists of an electronic controller, two E-SLMs, two
O-SLMs, and a 2-D photodetector array. Like the
matched-filter optical inference machine, the controller
in this system electronically stores the knowledge
base and controls the SLMs and the shutter. The
modulator O-SLM 1 is operated in the logic mode and
usually performs the AND operation, while O-SLM 2
is used as a 2-D memory unit to allow further processing
of the outputs, and is optional.

When the controller is given a query by the operator,
a vertical line is written on E-SLM 1 at the location of
the known data objects in D_i. Then the controller
writes the mapping template corresponding to the rule
(or first condition) associated with the query onto E-SLM 2.
The outputs of both E-SLMs are imaged onto
O-SLM 1 with lens L1. The logical AND of the two
inputs is formed in O-SLM 1 and imaged onto the
photodetector array by lenses L2 and L3. If desired,
the output could also be imaged onto O-SLM 2 by lens
L2 and latched. The stored output in O-SLM 2 could
then be imaged via lens L4 back into O-SLM 1 by
opening shutter S should further processing be necessary.

The output of the photodetector array is fed back to
the controller where the inferred data objects in D_o
which satisfy the current mapping rule are determined.
Further mapping templates are then applied
by the controller as determined by the query and rules.

Operation of this optical inference machine can be
demonstrated for the ? mother-of Jean query in Eq.
(6). As with the matched-filter machine, the mapped-template
system considers the effective form of the
mother-of rule given the data object Jean as specified
in Eq. (8). The controller first uses the mapping rule
template for father-of as shown in Fig. 5 and the input
vector corresponding to Jean, which is, from Eq. (1),
[0 0 0 0 0 1 0 0 0 0 0 0 0 0]. Since Jean is specified on the
output side of father-of, the input vector is expanded
horizontally rather than vertically on E-SLM 1. Scanning
the rows of the output array produces the output
vector [0 0 0 0 0 0 0 0 0 0 0 0 1 0], which corresponds to
Bob.

The controller then feeds this output vector back to
E-SLM 1 as the input vector for the married-to mapping
template. Since this input is on the right side of
the married-to rule, the vector is expanded vertically
on E-SLM 1. The inference operation is repeated with
the married-to mapping template on E-SLM 2, producing
the output vector (view along the columns)
[0 1 0 0 0 0 0 0 0 0 0 0 0 0], which indicates the conclusion
Beth.
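
The two-step solution of the ? mother-of Jean query can be reproduced with a compact Python sketch. It assumes, for illustration only, that each template is stored as the set of (x, y) index pairs of its transparent squares; the function apply_template and its known_side argument are hypothetical names, and the code abstracts away the expansion and readout directions of the optical system.

```python
# Data objects in the order of Eq. (1).
D = ["Karen", "Beth", "Peg", "Liz", "Sue", "Jean", "Ruth",
     "Mike", "Tom", "Bill", "Jim", "Fred", "Bob", "Sam"]
FATHER_OF = [("Bob", "Peg"), ("Bob", "Tom"), ("Bob", "Jean"),
             ("Jim", "Ruth"), ("Fred", "Bill")]
MARRIED_TO = [("Mike", "Karen"), ("Bob", "Beth"), ("Jim", "Liz")]


def template(pairs):
    """Transparent squares as (x, y) index pairs: D[x] relation D[y]."""
    return {(D.index(a), D.index(b)) for a, b in pairs}


def apply_template(squares, vec, known_side):
    """AND the expanded vector with the template and read the other side.
    known_side is 'x' when vec marks left-hand objects of the relation
    and 'y' when it marks right-hand objects."""
    out = [0] * len(D)
    for x, y in squares:
        if known_side == "x" and vec[x]:
            out[y] = 1
        elif known_side == "y" and vec[y]:
            out[x] = 1
    return out


# Effective rule of Eq. (8): Z father-of Jean, then Z married-to ?.
jean = [1 if d == "Jean" else 0 for d in D]
father = apply_template(template(FATHER_OF), jean, "y")
mother = apply_template(template(MARRIED_TO), father, "x")
print(father)
print(mother)
```

The intermediate vector marks Bob in the thirteenth position and the final vector marks Beth in the second, in agreement with the vectors quoted in the text.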

Since multiple outputs for the same data object
could be generated, viewing the rows or columns of the
output array could lead to an integral multiple of a
single light beam intensity. In this case, the photodetector
output is electronically clipped to the single
light beam level if the photodetector output is to be fed
back to E-SLM 1 as input via the controller.
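
The need for clipping can be seen in a small Python sketch. It is a hedged illustration only: the unit intensity per transparent square, the function names summed_readout and clip, and the choice of a query with three known children are assumptions made for this example.

```python
# Data objects in the order of Eq. (1).
D = ["Karen", "Beth", "Peg", "Liz", "Sue", "Jean", "Ruth",
     "Mike", "Tom", "Bill", "Jim", "Fred", "Bob", "Sam"]
FATHER_OF = [("Bob", "Peg"), ("Bob", "Tom"), ("Bob", "Jean"),
             ("Jim", "Ruth"), ("Fred", "Bill")]


def summed_readout(pairs, known_children):
    """Intensity per father position when the ANDed array is summed along
    one axis: one unit of light for every illuminated transparent square."""
    intensity = [0] * len(D)
    for father, child in pairs:
        if child in known_children:
            intensity[D.index(father)] += 1
    return intensity


def clip(intensity, single_beam=1):
    """Electronic clipping to the single-beam level before feedback."""
    return [min(level, single_beam) for level in intensity]


raw = summed_readout(FATHER_OF, {"Peg", "Tom", "Jean"})
binary = clip(raw)
print(raw[D.index("Bob")], binary[D.index("Bob")])
```

Without the clipping step, the value fed back to E-SLM 1 for Bob would be three times the single-beam level rather than a binary 1.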

If the optional optical feedback loop is not used,
there is a possible modification to the system in Fig. 6
which will simplify the device requirements. Instead
of performing the logical AND operation in O-SLM 1,
the output of E-SLM 1 (the expanded input data vector)
could be used to read out the mapping template in
E-SLM 2, thus eliminating the need for a 2-D optical
logic device. However, the advantage of having O-SLM 1
is that (1) it can conveniently perform the
logical NOT operation on a condition, and (2) the
processed patterns are automatically latched into O-SLM 1.
This allows the controller to begin setting up
the next mapping template while it simultaneously
reads the photodetector array, thus providing some
degree of concurrent operation.

A further possibility for increasing processing speed
is to place multiple mapping templates which are
spatially separated from each other on E-SLM 2. The
input data vectors on E-SLM 1 would have to be repositioned
accordingly. However, multiple inferences
could then be made in parallel.


IV.  Concluding  Remarks 

Basic architectures for a hybrid optical machine capable
of solving symbolic logic problems have been
discussed in general terms. This inference machine
was considered from both a rule-driven and a query-driven
approach. Two hybrid optical designs of a
query-driven inference machine were described which
used matched-filter logic and mapped-template logic.

In  comparing  the  two  designs,  the  mapped-template 
system  should  be  less  demanding  on  the  spatial  resolu¬ 
tion  characteristics  of  the  spatial  light  modulators  and 
should  be  easier  to  implement  than  the  matched-filter 
machine.  Furthermore,  the  mapped-template  system 
should  have  better  noise  performance  since  there  is  no 
analog processing in this system. That is, all optical
signals  remain  encoded  as  binary  intensity  levels  in  the 
mapped-template  system,  whereas  the  matched-filter 
system  must  contend  with  the  noise  from  the  analog 
matched-filtering  process,  even  though  binary  intensi¬ 
ty  input  and  output  patterns  are  used. 

Although two hybrid architectures have been presented,
other equally effective system designs are possible.
Given the growing interest in integrating symbolic
logic processing into the computer of the future,
the idea of downloading the inference operations of
scanning, searching, and matching to a parallel optical
processor merits continued investigation.

This  work  was  supported  in  part  by  the  Air  Force 
Office  of  Scientific  Research  under  grant  AFOSR-84- 
0358. 
ABSTRACT

Knowledge base systems (KBS's) are becoming increasingly
more important for many scientific and engineering
applications. Over the past few years, several researchers have
considered optics for implementing KBS's in an attempt to
capitalize on the potential speed and parallelism. This paper
presents a review of recent research efforts along with
a discussion of the relative merits and limitations of using
optics as an implementation technology. To a large extent,
past efforts have focused on (1) representing knowledge using
matrix-like formalisms and (2) designing system architectures
based on optical inner product processors (such as
matrix-vector multipliers) and optical correlators. Actual
implementations are impeded primarily by the limitations of
current spatial light modulators. New directions include the
use of symbolic substitution and neural network ideas.


INTRODUCTION 

A knowledge base system (KBS) manipulates symbolic
information to produce useful output conclusions given input
queries or requests. Three familiar examples are database
managers, relational database systems, and inference machines
(such as expert systems). The knowledge base consists
of sets of symbols and relationships between them. The
allowed operations are determined by the type of KBS being
considered. For example, if the knowledge base is a database
(or relational database) containing records of information,
typical operations include sorting and searching on specific
fields within a database record. For an inference machine,
the knowledge base is a collection of symbols and relationships
arranged as sets of facts (axioms) and rules. Besides
searching, a fundamental operation of such a KBS is inference
(usually deductive). If the facts and rules are focused
on a specific area of knowledge, the KBS is known as an
expert system.

Conventionally, KBS's have been implemented in software
on electronic computers. The primary languages have
been LISP, PROLOG, and, with specialized hardware,
PARALOG [1,2]. With the software approach, the knowledge
base and its associated operations are programmed into the
KBS. More recently, a connectionistic (neural network)
approach has been applied to an inference machine KBS to
realize a trainable expert system [3]. This method allows facts
and rules to be learned by a KBS using a train-by-example
procedure on known sets of input queries and output
conclusions.

While software-based KBS's have been quite adequate
for small- to medium-size knowledge bases, their performance
can be degraded significantly by large and especially very
large knowledge bases. To process vast amounts of information,
a large, fast memory that can be searched efficiently is
required. Herein lies the potential of an optical implementation
of a KBS.

Optics offers a high degree of parallelism with its natural
two-dimensional data path. This allows for efficient parallel
searching techniques, particularly when holographic storage
methods are employed. Furthermore, since light beams
can intersect with negligible interaction, large numbers of
interconnections between processing components can be made
with more flexibility than with electronic wires. To illustrate
the severity of the wiring problem, consider an optical data
path consisting of 1000 x 1000 pixels. Forming interconnections
between two planes with this format is relatively
straightforward for an optical system. However, electronically
interconnecting the 10^6 locations in one plane with the
10^6 locations in the second plane could require 10^12 wires
in the worst case (i.e., fully interconnected). Such a high
wiring density is effectively impractical for current VLSI
(Very Large Scale Integration) technology.
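
The wire count quoted above follows from simple arithmetic, which the short Python sketch below reproduces. The function name `worst_case_wires` is introduced here for illustration only.

```python
def worst_case_wires(rows, cols):
    """Return (locations per plane, wires needed to fully interconnect two planes)."""
    locations = rows * cols
    # Fully interconnected: every location in plane 1 is wired to every location in plane 2.
    return locations, locations * locations

locations, wires = worst_case_wires(1000, 1000)
print(locations)  # 1000000, i.e., 10^6 locations per plane
print(wires)      # 1000000000000, i.e., 10^12 wires
```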

In this paper, we present an overview of the basic
approaches being considered for realizing optical KBS's. The
discussion begins with a summary of the techniques for
representing symbolic information optically and continues with
a survey of the available optical hardware. Then, descriptions
of selected optical architectures are given, followed by
a discussion of the relative strengths and limitations of these
optical systems. The interested reader is referred to Ref. 4
for another view on the impact of optics on KBS's.

OPTICAL  KNOWLEDGE  BASE  SYSTEMS 

Representation  of  Symbolic  Information 

One of the fundamental design criteria for a KBS is how
symbolic information is represented. In a database application,
the database usually is organized as an array of records,
each of which is decomposed into a set of fields. For example,
in a database of corporations, each record could have fields
such as corporation name, address, telephone number, etc.
In an optical database system, the fields could be arranged
in a suitable spatial pattern such as that shown in Fig. 1, and
this pattern would constitute a record within the database.
An array of records could be organized into a matrix as
illustrated in Fig. 2, thus resulting in a two-dimensional storage
format for the database.

Figure 1. Example field structure for a corporation database
record.

Figure 2. Example record structure for a corporation
database. Each corporation has a record entry like that
shown in Fig. 1.

Figure 3. Diffractive-type memory for storing the knowledge
base.

Figure 4. ... binary relationship matrix.

In an optical inference machine, both the basis symbols
of information and the relationships between them (which
form the facts and rules) are encoded in the knowledge base.
There are primarily two methods for storing this information:
diffractive-type memories and relationship matrices.
As shown in Fig. 3, a diffractive-type memory is an optical
subsystem that processes an input light distribution using
diffraction to produce an output light distribution. The input
distribution represents either (1) a symbol, (2) a collection
of symbols, or (3) a portion of a symbol. The diffracted
output then corresponds to the associated or inferred symbol
implied by the input symbol in the first two cases or the
completed symbol in the third case. Two examples of this
type of storage medium are holographic associative memories
(both hetero- and auto-associative) and matched spatial
filters (MSF's).

In a relationship matrix, a symbol of information is
represented by a row and a column within the matrix. An
example format is shown in Fig. 4. The value of a matrix
element, T_ij, corresponds to the relative amount that symbol
s_j implies symbol s_i. In principle, T_ij could be a continuous
value, but in Fig. 4, it is shown as a binary value. With
a relationship matrix, the inference process is very similar
to an inner product operation like matrix-vector multiplication.
For example, a binary input vector s has s_j = 1 for
each symbol j that is active (0 otherwise). For a particular
relationship matrix T, the set of output symbols implied
by the selected input symbols is given by thresholding the
matrix-vector product Ts to form a binary output vector.
The thresholding operation is necessary to make the input
and output vectors compatible so other relationship matrices
may be applied to generate further conclusions. The
relationship matrix formalism includes the mapping templates
described in Ref. 5 and the directed graphs and adjacency
matrices in Refs. 6-8.
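
The thresholded matrix-vector inference described above can be sketched numerically as follows. This is a minimal illustration, assuming a hypothetical four-symbol knowledge base and a threshold of 1; the function name `infer` and the matrix entries are not from the cited references.

```python
import numpy as np

def infer(T, s, threshold=1):
    """Threshold the matrix-vector product T s to obtain a binary output vector."""
    product = np.asarray(T) @ np.asarray(s)
    return (product >= threshold).astype(int)

# Hypothetical relationship matrix: T[i, j] = 1 means symbol j implies symbol i.
T = np.array([
    [0, 0, 0, 0],
    [1, 0, 0, 0],   # symbol 0 implies symbol 1
    [0, 1, 0, 0],   # symbol 1 implies symbol 2
    [0, 1, 0, 0],   # symbol 1 implies symbol 3
])

s0 = np.array([1, 0, 0, 0])   # only symbol 0 is active
s1 = infer(T, s0)             # first conclusion
s2 = infer(T, s1)             # the binary output is reused as the next input
print(s1, s2)
```

Because the output of each step is again a binary vector, the same matrix (or a different one) can be applied repeatedly, which is the compatibility that the thresholding operation is meant to provide.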

By combining a diffractive-type memory with a relationship
matrix, more flexibility is gained for storing and accessing
a knowledge base. This is the approach used by most
optical KBS's.

Optical  Hardware 

There are a variety of general-purpose optical devices
that can be used in an optical KBS. Typically, an optical
KBS is a hybrid optical/electronic system with the electronics
serving to control the optics and to provide an interface
between the user and the optics. Consequently, four
classes of devices are needed: electrical-to-optical,
optical-to-optical, optical-to-electrical, and specialized. Because of
its unique coherence properties, the laser is usually the light
source of choice and should be assumed below unless otherwise
specified.

Electrical-to-optical devices, called electrically-addressed
spatial light modulators (E-SLM's), convert electronic signals
from the controller into suitable optical signals. For
one-dimensional (1-D) inputs (vectors), acousto-optic light
modulators and linear arrays of light-emitting diodes (LED's)
or laser diodes can be used. Two-dimensional (2-D) inputs
(matrices) can be realized with commercially available
E-SLM's such as the LightMod [9]. With some modification,
portable liquid-crystal-display (LCD) televisions also can be
employed [10-12].

Optical-to-optical devices, optically-addressed spatial
light modulators (O-SLM's), can be used as optical memories
to store intermediate conclusions or as active processing
elements. Commercially available O-SLM's include the
microchannel spatial light modulator (MSLM), the liquid crystal
light valve (LCLV), and the Pockels readout optical modulator
(PROM) [9]. Depending upon their internal design,
these O-SLM's can differ significantly in the types of active
processing operations offered. For example, the MSLM
can perform logic functions and thresholding and has long-term
storage for both analog and binary-level images [13].
By comparison, the LCLV also can implement logic functions
and thresholding but only has very short-term storage
[14,15].

In addition to these commercially available SLM's,
many other SLM's (both electrical-to-optical and
optical-to-optical) are currently under development [15], including
bistable optical devices (BOD's) [16].

Optical-to-electrical devices (optical detectors) transform
a 2-D light distribution into a set of serial or parallel
electronic signals that represent the output conclusions of the
optical KBS. Because of their usefulness in many other
applications, optical-to-electrical devices are the most
well-developed of those discussed so far. Examples of these
devices include silicon photodetector arrays and
charge-coupled-device (CCD) cameras.

Specialized devices perform dedicated tasks that cannot
be achieved easily with any SLM. Besides conventional optical
components such as lenses, mirrors, and beamsplitters,
fixed filter masks can be made from photographic film. Static
diffractive elements can be formed using holographic film and
dynamic diffractive elements can be realized using
photorefractive crystals [17].

Fundamental  Optical  Architectures 

Diffractive-Type Memory. The basic architecture for
implementing a diffractive-type memory is the correlator,
shown in Fig. 5. It consists of two Fourier transforming
lenses, L1 and L2, and a diffractive element arranged as a
coherent optical processor. The first lens forms the 2-D spatial
Fourier transform U_in(u, v) of the input light distribution
U_in(x, y) at the filter plane. Here, a diffractive optical element
such as a holographic filter or photorefractive crystal or
any other type of optical filter is placed. The transmittance
of this element, U_M(u, v), multiplies the transform U_in(u, v).

The light distribution exiting the diffractive element,
U_in(u, v) U_M(u, v), is transformed back from the spatial
Fourier domain into the space domain by the second lens
to produce the output distribution U_out(x, y). If the diffractive
element is a matched spatial filter (MSF), the output
plane will contain spots of light at the locations in the input
plane of the pattern being matched.

Figure 5. Basic optical correlator (space-invariant).

Figure 6. Optical inner product processor for implementing
relationship matrices. Expansion optics not shown.

For example, in the database record shown in Fig. 2,
if the diffractive element was an MSF of a city name and
the database of corporation records was placed in the input
plane, then all corporations in that particular city would be
indicated by spots of light in the output plane. These light
beams would be located at the positions of the "city" field in
the matching records. This example illustrates the parallel
searching capabilities of optics.
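
A numerical analogue of this matched-filter search can be sketched in Python with a discrete Fourier transform standing in for the lenses. This is only an illustration under stated assumptions: the 16 x 16 scene, the 3 x 3 pattern, the function name `correlate_fft`, and the 0.99 detection threshold are all hypothetical, and the FFT yields a circular correlation rather than the continuous optical one.

```python
import numpy as np

def correlate_fft(scene, pattern):
    """Cross-correlate a scene with a pattern by multiplying in the Fourier domain."""
    scene_spectrum = np.fft.fft2(scene)
    # Matched filter: complex conjugate of the pattern spectrum
    # (the pattern is zero-padded to the scene size).
    matched_filter = np.conj(np.fft.fft2(pattern, s=scene.shape))
    return np.real(np.fft.ifft2(scene_spectrum * matched_filter))

# Hypothetical "field" pattern and a scene containing two copies of it.
pattern = np.array([[1, 0, 1],
                    [0, 1, 0],
                    [1, 0, 1]], dtype=float)
scene = np.zeros((16, 16))
scene[2:5, 3:6] = pattern
scene[9:12, 10:13] = pattern

corr = correlate_fft(scene, pattern)
# Bright spots: correlation values close to the full pattern energy.
peaks = np.argwhere(corr > 0.99 * pattern.sum())
print(peaks)  # row/column offsets of the matching fields
```

Both occurrences are located in a single pass through the transform pair, which mirrors the way the optical correlator examines every record position at once.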

The configuration shown in Fig. 5 has all planes and
lenses separated by the focal length f. This setup performs
space-invariant processing whereby a shift in the input
U_in(x, y) only causes the output U_out(x, y) to shift accordingly.
However, other elements (lenses, filters, etc.) may be
added in addition to the distances being varied (appropriately)
to make a space-variant processor. In this case, the
output is not shift-invariant but will change as the input
light distribution is shifted.

Relationship Matrices. Since relationship matrices are
processed via a matrix-vector inner product with perhaps
a nonlinear thresholding on the result, this encoding scheme
can be implemented using the basic matrix-vector processor
shown in Fig. 6. A vector of light beams representing a set of
symbols is presented to the system via a 1-D SLM. Each light
beam is spread horizontally using cylindrical or fiber optics
(not shown) across a filter mask whose transmittances are
related to the values in a relationship matrix. This mask
can be implemented using another SLM.

The light then is focused vertically (using cylindrical or
fiber optics again) onto a 1-D O-SLM or a linear photodetector
array. Any thresholding operation that is needed is
performed by this device. Because of the crossed cylindrical
(or fiber) optics and the multiplicative transmittance mask,
this architecture implements the matrix-vector inner product
with complete parallelism.

Depending  upon  the  type  of  relationship  matrix  em¬ 
ployed,  it  is  also  possible  to  implement  an  effective  inner- 
product-type  operation  using  the  basic  optical  correlator  in 
Fig.  5.  This  method  usually  requires  the  correlation  filter  to 
be encoded in a special way. Therefore, the details of this
type of approach are only referenced in the next section.

Optical  KBS  Research 

Several  researchers  have  investigated  various  aspects  of 
optical  knowledge  base  systems,  particularly  for  inference 
applications [5-8,18-29]. In this section, several of the
approaches taken are summarized in terms of the data
representations employed, the focus of the work with respect to
optical  KBS’s,  and  the  relevant  optical  architectures  and  is¬ 
sues.  The  nomenclature  used  in  the  literature  is  retained 
here  and  related  to  the  general  framework  developed  above. 

The  review  focuses  on  inference  machines  since  most  of 
the  research  efforts  have  been  directed  towards  this  area.  In 
this  type  of  KBS,  the  system  is  presented  with  a  query  and 
then  conclusions  are  derived  from  the  relationships  in  the 
knowledge  base  using  deductive  inference.  The  conclusions 
can  be  either  yes/no  responses  or  the  set  of  symbols  that 
satisfy  the  query. 

Warde and Kottas [5] present two architectures for an
optical inference machine based on a simplified implementation
of PROLOG. Here, the relationships are used to construct
a set of facts and rules about data objects (the symbols
in Ref. 5). One architecture, shown in Fig. 7, stores
the relationships of a knowledge base in a diffractive-type
memory and uses an optical correlator as a matched filter to
infer new symbols from previous ones. The relationships are
stored holographically on O-SLM 1 via E-SLM 1 and can be
updated by the electronic controller as needed. The input
plane, filter plane, and output plane of the matched-filter
correlator are P1, P2, and P3, respectively. Alternatively,
plane Pe at the input to O-SLM 2 could be used as the
correlator output plane. In this case, O-SLM 2 can be used to
perform additional processing on the image representing the
inferred symbols. Examples of such processing include
accumulating sequences of inferred symbols and then combining
them using thresholding and/or logic operations (AND's,
OR's, etc.).

The second architecture, shown in Fig. 8, implements a
relationship-matrix form of the knowledge base using optical
mapping templates. These templates are binary transmission
masks that, when used in an inner product processor,
can infer a new vector of symbols from a given vector of
symbols. The input symbols are written in an expanded form on
E-SLM 1 by the electronic controller and the mapping
templates are stored on E-SLM 2. In effect, the templates cause
E-SLM 2 to become a modifiable crossbar switch. The inner
product is formed by using the expanded vector of input
symbols from E-SLM 1 to read out the current mapping
template on E-SLM 2. O-SLM 1 thresholds the output which
then can be latched into O-SLM 2 for further processing
or accumulation of results. The electronic controller has a
more active role in this architecture than in the matched-filter
architecture because both the input symbols and mapping
templates need to be updated as often as dictated by
the inference problem being solved.

Figure 8. Mapped-template optical inference machine (from
Ref. 5).

Jau et al. [8] also have considered optical expert systems
from a PROLOG viewpoint and have investigated an approach
based on relationship matrices that is similar to the
mapping templates of Warde and Kottas [5]. They develop a
method for combining binary relationship matrices (fact
matrices) using matrix algebra into new relationships, allowing
more complex rules to be generated out of the basis set of
relationships and symbols (the facts in the knowledge base).
In addition, they extend this formalism by presenting an
algorithm for updating the relationship matrices via an update
rule when new information is available. The proposed
architecture is a general optoelectronic system based on optical
matrix-vector and matrix-matrix multipliers.

McAulay [18] uses a probabilistic relationship matrix to
develop  a  forward-inference  architecture  for  a  real-time  di¬ 
agnostic  expert  system.  In  this  type  of  system,  the  input 
symbols  are  a  set  of  events  or  conditions  and  the  output 
symbols  are  a  collection  of  hypotheses.  A  query  consists  of 
a  particular  set  of  input  events  and  the  output  conclusion  is 
the  probability  that  each  hypothesis  is  true.  In  the  medical 
expert  system  described  by  McAulay,  the  input  conditions 
are  symptoms  and  the  output  hypotheses  are  illnesses. 

The  architecture  is  shown  in  Fig.  9.  Binary  input  symbols 
(called  events  in  Ref.  18)  are  presented  to  the  system  one  at 
a  time  on  the  1-D  SLM  (left  side  of  Fig.  9).  This  input  is 
split  between  parallel  channels,  each  of  which  forms  an  inner 
product  processor.  The  2-D  SLM’s  in  each  channel  store  a 
relationship  matrix.  In  one  channel,  the  matrix  is  the  set 
of  a  priori  probabilities  that  an  input  event  corresponds  to 
a  particular  outcome,  and  the  matrix  in  the  other  channel 
stores  the  probability  of  an  event  occurring  in  the  absence  of 
the particular outcome.

The outputs of these channels are detected by 1-D CCD's
and combined in a set of N parallel processors to form a set
of a posteriori probabilities. These processors, in conjunction
with another inner product processor configured as a
doubling summer, compute the updated Bayesian probabilities
for each outcome (hypothesis) as each additional event is
given as input. Once all events have been processed, the output
of the final 1-D CCD array (on the right side of Fig. 9)
contains the final probabilities of all outcomes given all the
events.
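
The event-by-event probability update can be illustrated with the standard Bayes rule, as in the Python sketch below. This is not McAulay's optical implementation: it assumes that events are conditionally independent and that each hypothesis is updated separately from the others, and the function name `update` and all probability values are hypothetical.

```python
import numpy as np

def update(prior, p_event_given_h, p_event_given_not_h):
    """Apply Bayes' rule to every hypothesis for one observed event."""
    numerator = p_event_given_h * prior
    evidence = numerator + p_event_given_not_h * (1.0 - prior)
    return numerator / evidence

# Hypothetical matrices: rows are events (symptoms), columns are hypotheses (illnesses).
# One matrix per channel, following the two-channel description above.
P_EVENT_GIVEN_H = np.array([[0.9, 0.2],
                            [0.7, 0.6]])
P_EVENT_GIVEN_NOT_H = np.array([[0.1, 0.3],
                                [0.2, 0.5]])

posterior = np.array([0.5, 0.5])   # assumed initial belief in each hypothesis
for event in (0, 1):               # events are presented one at a time
    posterior = update(posterior,
                       P_EVENT_GIVEN_H[event],
                       P_EVENT_GIVEN_NOT_H[event])
print(posterior)
```

Each pass through the loop plays the role of presenting one more event to the system, with the previous posterior serving as the prior for the next update.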

Figure 9. Optical architecture for a real-time diagnostic
expert system (from Ref. 18).

Eichmann and Caulfield [19] consider the same type of
problem as McAulay, although in a different context. They
present two methods for determining the elements of a
relationship matrix which will aid in making decisions. The
input symbols are in the form of a binary knowledge vector that
contains the answers to several yes/no questions (the events
of McAulay [18]). The output is either a binary answer vector
(one method) or a set of a posteriori probabilities (the
second method) indicating the inferred conclusions (which
hypotheses are true). Both methods are based on Bayesian
principles and optimal Gaussian classifiers, and an algorithm
for incrementally updating the relationship matrix elements
(the weights) is prescribed. The implied optical architecture
utilizes threshold logic units in the conventional optical
matrix-vector and matrix-matrix multipliers (e.g., the inner
product processor in Fig. 6).

Szu and Caulfield [20] employ relationship matrices as
associative memories. An interesting aspect of their optical
representation for the relationship matrices is that each matrix
value is represented by a 2-D binary submatrix rather
than a single transmittance. This submatrix allows particular
attributes of a symbol to be incorporated into the knowledge
base, although it requires more space on an SLM. They
propose to input queries using SLM's and to store the
relationship matrices in page-oriented holographic memories.
The paged memory uses many holograms, each of which
stores a subset of the total knowledge base. This method
allows the knowledge base to be increased by simply adding
holograms to the system.

Haney et al. [21] have investigated optical techniques for
increasing the efficiency of heuristic searches. They use binary
constraint matrices (another form of relationship matrix)
as a means of pruning a search tree through the knowledge
base before or during the search. In general, a constraint
matrix is a binary-valued array indicating a specialized
collection of facts which relate multiple sets of symbols.
Binary constraint matrices focus on two sets of symbols.

In Ref. 21, Haney et al. develop a set of binary constraint
matrices using a consistent labeling problem as an
example. In this type of problem, there are N units (one set
of symbols), each of which is to be assigned one of L labels
(the second set of symbols). Let the units be denoted by
u_1, u_2, ..., u_N and the labels by l_1, l_2, ..., l_L. Furthermore,
let R be a relationship that associates units with labels. Now,
a set of facts can be generated in the dyadic form "u_i R l_m"
and can be interpreted as "Unit i is associated with label
m through relationship R." A constraint matrix R(i,j) (in
Ref. 21) is an L x L matrix with binary elements r_mn such
that r_mn = 1 when the facts "u_i R l_m" and "u_j R l_n" are
consistent with (i.e., satisfy) other constraints, such as each unit
must have a different label. These additional constraints are
actually higher level relationships which depend upon the
units, the labels, and the basic association relationship R.
If r_mn = 0, both facts above cannot be true simultaneously,
and any other conclusions that would depend upon them can
be ignored.

Figure 10. Procedural-based optical inference processor
(from Ref. 23).

With this formalism, the pruning of a search tree is equivalent
to a forward search through the constraint matrices.
This can be done using an optical matrix-vector multiplier.
To eliminate the j-th unit, the rows of R(i,j) are multiplied by
the matrix R(j,k) to form the rows in the new (and stronger)
constraint matrix R'(i,k). This process can be repeated to
reduce effectively the size of the search tree that needs to be
traversed.
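
A small numerical example may help make the elimination step concrete. The Python sketch below treats the multiplication as a Boolean matrix product and then combines the result with a direct constraint between units i and k by a logical AND. That final AND, the function name `compose`, and the two-label problem are assumptions introduced for illustration and are not taken from Ref. 21.

```python
import numpy as np

def compose(R_ij, R_jk):
    """Boolean matrix product: label pairs for (i, k) supported by some label of unit j."""
    return ((R_ij @ R_jk) > 0).astype(int)

L = 2  # hypothetical problem with only two labels
# "Each unit must have a different label": r_mn = 1 only when m != n.
all_different = 1 - np.eye(L, dtype=int)

R_ij = all_different
R_jk = all_different
R_ik = all_different

induced = compose(R_ij, R_jk)   # constraint on (i, k) implied through unit j
stronger = R_ik & induced       # assumed combination with the direct (i, k) constraint
print(induced)
print(stronger)
```

With only two labels, eliminating unit j forces units i and k to share a label, which conflicts with the direct requirement that they differ. The combined matrix is therefore all zeros, so every branch of the search tree involving these three units can be pruned without being traversed.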

Casasent and his colleagues [6,22-25] have investigated
the use of knowledge base processing techniques for object
recognition, identification, and classification. These
researchers utilize an assortment of different architectures
based on the optical correlator shown in Fig. 10. The input
to this system (at plane P1) is an image of objects to
be recognized rather than an encoded vector or matrix of
symbols. A holographic filter containing a set of
frequency-multiplexed MSF's for the desired objects is placed in the
filter plane (P2). The spatial carrier frequencies cause the
output correlations for the objects to appear on different
detectors in the output plane (P3) [23].

The symbolic logic processor (unit) after plane P3 is used
when various features of the objects are used to make the
MSF's instead of the objects themselves. The electronic
feedback loop from the symbolic logic processor to the filter plane
allows various filters to be synthesized and used to analyze
the image. As a result, more structured relationships such as
"Object A has all of the features of object B but none of the
features of object C" can be processed. Since, in this example,
the features for B and C must be examined first before a
determination about A can be made, this architecture
implements a procedural-like algorithm for object recognition. The
symbolic logic processor can be implemented using either an
electronic controller or a more sophisticated arrangement of
optical correlators.

In the other work [6,22,24,25], Casasent and his colleagues
develop other data representations based on directed
and relational graphs (both forms of relationship matrices)
for performing object recognition using optical knowledge
base processing techniques. Furthermore, these authors
present alternative architectures such as space- and
time-integrating optical processors for directed graphs [6].
In another approach, multiple optical correlators are used
to implement a production system (a type of inference
machine) using both neural network and symbolic substitution
ideas [26]. For further details, the interested reader is
directed to the references cited above.

Figure 11. Block diagram of an optical resolution system
(from Ref. 29).


A rather novel approach to making inferences from symbolic
information is being investigated by Schmidt and
Cathey [27-29]. They are examining the use of mathematical
resolution to make inferences as a means of avoiding
the often-large dynamic data requirements normally found in
artificial intelligence problems. This approach is unique in
that mathematical resolution is a "proof-by-contradiction"
method of inference.

Given a statement whose truth is in question (the query),
the process of mathematical resolution proves that the statement
is true by showing that the negation of the statement
contradicts the axioms (the facts and rules arranged from
the relationships and symbols in the knowledge base). This
involves accepting an input query in the form of a binary vector
and combining it with the literals (the facts or axioms in
Refs. 27-29, also represented by binary vectors). The rules
of the knowledge base are used to test the resulting vectors
for tautologies. The resulting set of vectors can be reduced
further by eliminating nontautological vectors (conclusions
that contradict the knowledge base information) and useless
or duplicate tautologies. This process may require several
iterations.

A block diagram of an optical resolution system is shown
in Fig. 11. It combines five different subsystems:

1. A stack memory with parallel access for storing new
result vectors.

2. A processor to combine vectors in parallel.

3. A processor to reject nontautological vectors in parallel.

4. A processor to eliminate duplicate tautological vectors
in parallel.

5. A controller to monitor the vectors to end the iteration.

Schmidt in Ref. 29 presents optical architectures for implementing
these subsystems. He has simulated the system on a
sample inference problem, compared it to a conventional
serial approach, and found that the identification and
elimination of tautologies (step 4 above) is computationally
the most significant step of the optical resolution process.
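
The iterative store/combine/filter/stop loop described above can be sketched in software. The Python example below is a minimal illustration that assumes standard textbook propositional resolution on clauses written as sets of signed integers. It is not the binary-vector encoding of Refs. 27-29, and its filtering step (discarding resolvents that contain both a literal and its negation) need not correspond to the vector tests used there. All function names are hypothetical.

```python
from itertools import combinations


def resolve(c1, c2):
    """Return all resolvents of two clauses (frozensets of signed ints)."""
    return [
        frozenset((c1 - {lit}) | (c2 - {-lit}))
        for lit in c1
        if -lit in c2
    ]


def is_trivial(clause):
    """True if the clause holds both a literal and its negation."""
    return any(-lit in clause for lit in clause)


def refutes(axioms, negated_query, max_iters=100):
    """Return True if the axioms plus the negated query yield a contradiction."""
    clauses = set(axioms) | set(negated_query)   # store of result clauses
    for _ in range(max_iters):                   # controller: bound the iteration
        new = set()
        for c1, c2 in combinations(clauses, 2):  # combine clauses pairwise
            for r in resolve(c1, c2):
                if not r:
                    return True                  # empty clause: contradiction
                if not is_trivial(r):            # filter uninformative results
                    new.add(r)                   # a set drops duplicates
        if new <= clauses:
            return False                         # nothing new: stop
        clauses |= new
    return False                                 # undecided within max_iters


# Axioms: P, and P implies Q (written as "not P or Q"). Query: Q.
axioms = [frozenset({1}), frozenset({-1, 2})]
print(refutes(axioms, [frozenset({-2})]))
```

Here the literal P is encoded as 1 and Q as 2; the query is proved because adding its negation to the axioms eventually produces the empty clause.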

A very different approach to symbolic knowledge processing
is considered by Derstine and Guha (7). They propose
an optical architecture called SPARO (Symbolic Processing
ARchitecture in Optics) to implement a symbolic processing
language. Their work, along with that of other researchers
already mentioned (Warde and Kottas (5) and Jau et al. (8)),
was influenced by the PROLOG language. In particular,
Derstine and Guha consider PARLOG, a version of PROLOG
in which predicates are evaluated in parallel rather than
serially.

Figure 12: Schematic of the SPARO architecture for an optical
finite state machine (from Ref. 7).

The  SPARO  architecture  is  intended  to  solve  a  particular 
symbolic  processing  problem,  that  of  combinator  graph  re¬ 
duction  in  pure  functional  languages  such  as  PARLOG.  This 
process  involves  reducing  a  combinator  graph  (similar  to  a 
binary  graph)  to  its  simplest  form  that  is  consistent  with  the 
relationships  in  the  knowledge  base. 
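
For readers unfamiliar with combinator reduction, the following Python sketch reduces expressions built from the standard S, K, and I combinators. The choice of this combinator basis, the tuple representation, and the function names are assumptions made for illustration; the text does not state which combinators SPARO uses, and a tree is used here in place of a shared graph.

```python
def step(term):
    """Apply one leftmost-outermost reduction; return (term, changed)."""
    if isinstance(term, tuple):
        f, x = term                      # an application "f x"
        if f == "I":                     # I x -> x
            return x, True
        if isinstance(f, tuple) and f[0] == "K":
            return f[1], True            # K a b -> a
        if (isinstance(f, tuple) and isinstance(f[0], tuple)
                and f[0][0] == "S"):     # S a b c -> (a c) (b c)
            a, b, c = f[0][1], f[1], x
            return ((a, c), (b, c)), True
        new_f, changed = step(f)         # otherwise look inside f ...
        if changed:
            return (new_f, x), True
        new_x, changed = step(x)         # ... and then inside x
        if changed:
            return (f, new_x), True
    return term, False


def reduce_term(term, max_steps=1000):
    """Reduce until no rule applies (the simplest form of the expression)."""
    for _ in range(max_steps):
        term, changed = step(term)
        if not changed:
            return term
    raise RuntimeError("no normal form found within max_steps")


# S K K x behaves like the identity, so it reduces to x.
print(reduce_term(((("S", "K"), "K"), "x")))
```

This sketch applies one rule at a time, whereas the architecture described here is intended to perform the reduction in parallel.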

The SPARO architecture is illustrated in Fig. 12 and
the corresponding optical data path in Fig. 13. It is a
type of optical finite state machine (OFSM) formed by layers
of processing elements (optical logic gates) and optical
interconnects. Sets of symbolic substitution optics in conjunction
with the interconnections effectively implement the
“microcode” between processor elements for performing the
graph reduction in parallel. As shown in Fig. 13, the optical
data path is decomposed into an array of areas, one for
each processor node. A node is a collection of optical bits
that make up the state of the node (local memory) and an
interconnection register (analogous to a pointer in a software
data structure). There is no external or addressable memory
system here.

With each iteration around the loop (equivalent to one
machine cycle), the processor nodes are updated and the
new state information is broadcast to the appropriate destination
nodes. In principle, it should be possible to reprogram
the system to perform different functions by redesigning the
symbolic substitution optics and the interconnects.

Figure 13: Optical data path for the SPARO architecture at
the main processor plane in Fig. 12 (from Ref. 7).

Alternative Approaches

The idea of using symbolic substitution as a means of
doing optical processing is not new. However, the work by
Derstine and Guha (7) represents one of the first attempts at
using symbolic substitution specifically in an optical KBS.
Casasent and Botha (30) also have considered symbolic substitution
in the context of multifunctional optical processing
systems based on multiple optical correlators. Several other
researchers have been working on optical architectures for
performing general symbolic substitution (31-34). Work in
this field should be encouraged since this method is another
way to implement relationships between symbols.
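
As a software analogy for symbolic substitution, the Python sketch below follows the usual two-phase description: every occurrence of a search pattern is first recognized in a binary image, and a replacement pattern is then written at each recognized location. The function name, the list-of-lists image format, and the requirement that both patterns have the same shape are assumptions of this sketch; it does not model how the optics perform the two phases.

```python
def symbolic_substitution(image, search, replace):
    """Replace every occurrence of `search` in `image` with `replace`.

    `image`, `search`, and `replace` are lists of rows of 0/1 values;
    `search` and `replace` are assumed to have the same shape.
    """
    rows, cols = len(image), len(image[0])
    pr, pc = len(search), len(search[0])

    # Recognition phase: all matches are found on the unmodified input.
    hits = [
        (r, c)
        for r in range(rows - pr + 1)
        for c in range(cols - pc + 1)
        if all(image[r + i][c + j] == search[i][j]
               for i in range(pr) for j in range(pc))
    ]

    # Substitution phase: start from a blank plane and superimpose
    # one copy of the replacement pattern per recognized location.
    out = [[0] * cols for _ in range(rows)]
    for r, c in hits:
        for i in range(pr):
            for j in range(pc):
                out[r + i][c + j] |= replace[i][j]
    return out


# Hypothetical rule: a bright pixel above a dark one becomes
# a dark pixel above a bright one.
image = [[1, 0, 1, 0],
         [0, 1, 0, 1]]
print(symbolic_substitution(image, [[1], [0]], [[0], [1]]))
```

Because recognition is completed before any substitution is written, the result does not depend on the order in which the matches are visited, which is the property that makes the rule suitable for parallel evaluation.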

Furthermore, the theories of connectionist (neural network)
computing offer several opportunities for optical
KBS’s. There currently is considerable work being done on
optical neural networks (35). Given that neural networks
can be trained to perform a wide variety of tasks, this approach
has the exciting possibility that the knowledge base
can be learned instead of programmed. Botha et al. (26)
have started to look at a combination of neural networks
and symbolic substitution for optical inference systems.

Other approaches include developing alternative data
representations. For example, Kottas and Warde (36) are
examining various neural network methods for incorporating
the time domain into the representations of symbolic
information. This is in contrast to the relationship matrix
formalism in which the symbols are represented by static
positions in the matrices. One potential advantage of this
approach is that the system could make use of its state space
more efficiently at the expense of slower data access. This
could allow larger knowledge bases to be considered using
current optical device technology. Preliminary results show
that the methods being developed actually could take
advantage of any imperfections or irregularities in the optical
devices. Nevertheless, different data representations should
be considered in future designs of optical KBS’s.

Strengths and Limitations

Two of the advantages of an optical KBS implementation
result from the use of holographic storage techniques
and optical interconnects between processing stages. The
holographic storage offers large capacity, fault tolerance, and
parallel searching methods. Optical interconnects permit
large numbers of connections between pixels in two processing
planes with low crosstalk.

On the other hand, the development of practical optical
KBS’s, particularly inference machines, is limited primarily
by the SLM’s. Although some are commercially available
and many more are under development, current SLM’s
have relatively low framing rates and resolutions. Thus,
while the optical propagation delay between devices is
negligible, the processing delay within an SLM is significant.
This also can cause input/output bottlenecks, particularly in
the electrical-to-optical and optical-to-electrical conversions.
Because of their limited resolution and finite device size,
SLM’s can support only small numbers of symbols, either
for a database or a knowledge base. As a result, the optical
KBS architectures described above are usually slower than
their software counterparts. However, when large numbers
of symbols can be supported by the SLM’s, optical KBS’s
could become feasible and cost effective.

SUMMARY 

Optical knowledge base systems are still in the research
stage. In the area of knowledge representation, several
encoding schemes have been developed, although most rely on
some sort of matrix formalism. In the optics realm, various
system architectures have been proposed to implement
these encoding schemes. These architectures are based
primarily on optical inner product processors (such as
matrix-vector multipliers) and optical correlators. Most of them
have been investigated with theoretical analyses and/or
computer simulations. To date, few of them have been built
using real optical hardware. Because of device limitations,
particularly with spatial light modulators, these implementations
are restricted to relatively small knowledge bases and
simple symbolic processing problems (like forward inference).
With improved devices, these architectures could prove to
be practical when large knowledge bases are considered.

In the meantime, new methods for representing knowledge
and more advanced architectures are needed that can
utilize current device technology. Fortunately, alternative
paths for research exist through areas such as optical neural
networks and optical symbolic substitution techniques.
The future offers exciting potential for the development of
knowledge base processing systems based on optical
implementation technologies.